(54) Title: METHODS AND REAGENTS FOR MOLECULAR CLONING

(57) Abstract: The present invention provides compositions, methods, and kits for covalently linking nucleic acid molecules. The
methods include a strand invasion step, and the compositions and kits are useful for performing such methods. For example, a method
of covalently linking double stranded (ds) nucleic acid molecules can include contacting a first ds nucleic acid molecule, which has
a topoisomerase linked to a 3' terminus of one end and has a single stranded 5' overhang at the same end, with a second ds nucleic
acid molecule having a blunt end, such that the 5' overhang can hybridize to a complementary sequence of the blunt end of the second
nucleic acid molecule, and the topoisomerase can covalently link the ds nucleic acid molecules. The methods are simpler and more
efficient than previous methods for covalently linking nucleic acid sequences, and the compositions and kits facilitate practising the
methods, including methods of directionally linking two or more ds nucleic acid molecules.
METHODS AND REAGENTS FOR MOLECULAR CLONING

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION 

The invention relates generally to compositions and methods for facilitating the
construction of recombinant nucleic acid molecules, and more specifically to
compositions useful for covalently linking two or more nucleic acid molecules, including
for directionally or non-directionally linking the nucleic acid molecules, and to methods
of generating such covalently linked recombinant nucleic acid molecules.

BACKGROUND INFORMATION 

The ability to clone large numbers of nucleotide sequences, including gene
sequences and open reading frames, allows a great deal of information to be obtained
about gene expression and the regulation thereof. In addition, such sequences can be
useful for understanding the etiology of disease conditions and, ideally, can provide a
means to diagnose and treat such diseases. However, while it is a relatively simple matter
to clone large numbers of expressed nucleotide sequences, for example, it is a more
difficult undertaking to characterize the regulatory elements involved in the expression of
such sequences and to properly express a polypeptide encoded by the sequence. In
particular, there is a need for improved methods for ligating nucleic acid molecules and
cloning nucleic acid molecules such that a functional recombinant nucleic acid molecule
is produced. There is a particular need for directional cloning methods, wherein an insert
can be cloned into a vector or linked to one or more other nucleic acid molecules in a
predetermined orientation.

The use of topoisomerases provides a convenient means to improve cloning and
ligation methods. For example, the use of topoisomerase to perform rapid ligation of
polymerase chain reaction (PCR) products into a vector has cut traditionally laborious
cloning methods down to a five minute procedure. As such, topoisomerase is particularly
useful for high throughput cloning applications. However, given the current demand for
expressing open reading frames (ORFs) in genome scale molecular cloning procedures,
there still remains a need to better control the orientation in which two or more nucleic
acid molecules are linked such that functional recombinant nucleic acid molecules such as
expressible cloned nucleic acid molecules can be prepared.

Expression of cloned ORFs demands that the PCR product be inserted into the
vector in its correct orientation, so as to work in accord with functional expression
domains located on the vector. In the current state of the art for topoisomerase mediated
cloning, ORFs are amplified by PCR using various DNA polymerases. A polymerase
such as Taq, which does not have a proof-reading function and has an inherent terminal
transferase activity, is commonly used, and produces PCR products containing a single,
non-template derived 3' A overhang at each end. These amplification products can be
efficiently cloned into topoisomerase-modified vectors containing a single 3' T overhang
at each end (TOPO TA Cloning® Kit, Invitrogen Corp., Carlsbad, CA). In comparison, a
polymerase such as Pfu, which has an inherent 3' to 5' exonuclease proof-reading activity,
produces PCR products that are blunt-ended. Topoisomerase-modified vectors containing
blunt ends are available for cloning of PCR products produced with proofreading
polymerases (Zero Blunt TOPO® PCR Cloning Kit, Invitrogen Corp., Carlsbad, CA).
Incubation of either PCR product and the proper topoisomerase-modified vector results in
a five minute ligation. However, the orientation of the insert obtained using such cloning
methods is random.

Because the orientation of DNA fragment insertion into topoisomerase-modified
cloning vectors is random, users must screen clones to identify those having the proper
orientation. Insert orientation can be determined using various methods including, for
example, restriction enzyme analysis, in vitro transcription from vector-encoded promoter
elements, and PCR using, for example, one insert-specific primer and one vector-specific
primer. As is evident, however, the requirement for determining insert orientation
requires an investment of time and can substantially increase the cost for identifying a
nucleic acid molecule of interest, particularly where a high throughput cloning method is
used. As such, current cloning methods are severely limited, particularly for high
throughput gene expression analysis, for several reasons: numerous laborious
steps must be performed in order to select clones with correctly oriented inserts, and there
is a need to screen as many as eight colonies of each clone to identify one having the
proper orientation. Thus, a need exists for methods and reagents that are useful for
covalently linking two or more nucleic acid molecules in a directional orientation. The
present invention satisfies this need and provides additional advantages.
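
The screening burden described above can be illustrated with a short calculation. The
following Python sketch assumes, purely for illustration, that each colony independently
carries the insert in the proper orientation with probability 0.5; the source states only
that the orientation is random, so this value and the function names are hypothetical.

```python
import math


def prob_at_least_one_correct(n_colonies: int, p_correct: float = 0.5) -> float:
    """Probability that at least one of n screened colonies has the proper orientation.

    Assumes each colony is an independent trial with success probability p_correct
    (0.5 is an illustrative assumption for random insertion, not a value from the text).
    """
    if n_colonies < 0 or not 0.0 < p_correct < 1.0:
        raise ValueError("need n_colonies >= 0 and 0 < p_correct < 1")
    return 1.0 - (1.0 - p_correct) ** n_colonies


def colonies_needed(confidence: float, p_correct: float = 0.5) -> int:
    """Smallest number of colonies giving at least `confidence` of one correct clone."""
    if not 0.0 < confidence < 1.0 or not 0.0 < p_correct < 1.0:
        raise ValueError("confidence and p_correct must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_correct))


if __name__ == "__main__":
    # Eight colonies per clone, as mentioned in the background discussion.
    print(prob_at_least_one_correct(8))
    print(colonies_needed(0.99))
```

Under these assumptions, screening several colonies per clone is needed only because
orientation is uncontrolled; a directional method removes that factor from the workload.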

SUMMARY OF THE INVENTION 

The present invention provides compositions and methods for covalently linking
two or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) double stranded ("ds") nucleic acid
molecules, including directionally or non-directionally linking two or more ds nucleic
acid molecules. Nucleic acid molecules used in accordance with the invention preferably
comprise a first end and a second end. The first and/or second end of such molecules
preferably has a 5' and/or 3' extension or overhang. Thus, one or both ends of the nucleic
acid molecules used in the invention can have a 3' and/or 5' overhang. The overhang
sequences can be the same or different sequences, and can be the same or different types
(e.g., 3' or 5' overhang) at both ends of the molecule. In addition, while one end of the
nucleic acid molecule can have a 3' extension and 5' extension, the other end of the
molecule can, but need not, have an extension. In some aspects, one end of a nucleic acid
molecule can contain a 3' overhang or 5' overhang while the other end can be blunt ended
(i.e., it has no overhang). In accordance with the invention, the 3' and/or 5' extension
sequences (i.e., overhangs) at any terminus can be any length (i.e., any number of
nucleotides), and can have any sequence. Thus, the invention relates to nucleic acid
molecules having single or multiple nucleotide overhangs. In some aspects, the nucleic
acid molecules and their termini can include modified or labeled nucleotides. In the use
of the invention, enzymes or proteins capable of fusing or joining or ligating nucleic acid
molecules can be used. Thus, two or more nucleic acid molecules, which can be the same
or different, can be joined directionally using such enzymes. Such enzymes or proteins
include, but are not limited to, topoisomerases (including types IA, IB, II, etc.),
recombinase proteins (including FLP recombinase, Int integrase, Cre recombinase, etc.),
and ligases (including T4 DNA ligase, etc.).



In the methods of the invention, the 3' or 5' overhang of one terminus of the first
nucleic acid molecule can have homology (or is complementary) to at least one sequence
at or near the terminus of at least a second nucleic acid molecule. Thus, through base
pairing or hybridization of the 3' or 5' overhang or extension with the homologous or
complementary sequence on the second molecule, the invention allows directional or
non-directional association or joining of two different molecules. In a preferred aspect, the
3' or 5' overhang of one terminus of at least a first molecule can engage in strand invasion
as it associates or hybridizes with its complementary sequence at or near the terminus of
the second molecule. In one aspect, such a strand invasion event allows the 3' or
5' overhang to directionally associate with a desired end of its partner molecule. By
designing the overhangs and the termini of the molecules to be joined, two or multiple
partner molecules can be joined in the presence of one or more proteins or enzymes
having ligase activity (e.g., topoisomerases, ligases, recombinases, etc.) in accordance
with the invention. Thus, the invention provides methods for connecting two or more
nucleic acid molecules (e.g., double stranded nucleic acid molecules) which involve
covalently linking at least one strand of one molecule to at least one strand of another
molecule. The invention further provides compositions for preparing nucleic acid
molecules connected by methods of the invention and compositions produced by methods
of the invention.

Processes of the invention are exemplified by methods described herein which
involve the covalent linkage of strands of different nucleic acid molecules catalyzed by
topoisomerase. Thus, the present invention relates, in part, to an isolated ds nucleic acid
molecule having a first end and a second end, wherein the first end contains a first
5' overhang and a first topoisomerase covalently bound at the 3' terminus, and the second
end contains a second topoisomerase covalently bound at the 3' terminus and contains a
second 5' overhang, a blunt end, or a 3' thymidine overhang, wherein the first 5' overhang
is different from the second 5' overhang. The first topoisomerase and second
topoisomerase can be the same or different. The first 5' overhang can have any nucleotide
sequence, including, for example, the nucleotide sequence 5'-GGTG-3'.

In one embodiment, the ds nucleic acid molecule is a vector, which can be a linear
vector such as a lambda vector or a linearized vector such as a linearized plasmid. The
vector can be a cloning vector or an expression vector, and can contain, for example, one
or more (e.g., 1, 2, 3, 4, 5, 6, etc.) recombinase recognition sites such as one or more lox
sites or one or more att sites, one or more transcriptional regulatory elements, one or more
translational regulatory elements, one or more nucleotide sequences encoding a peptide of
interest such as one or more selectable markers or one or more tags, or combinations
thereof. For example, the vector can be a pUni/V5-His version A (SEQ ID NO:16) vector
or a pCR®2.1 (SEQ ID NO:17) vector.

The present invention also relates to methods of directionally or non-directionally
linking two, three, four or more nucleic acid molecules, including, as desired, operatively
linking two or more of the nucleic acid molecules. A method for generating a
directionally linked recombinant nucleic acid molecule can be performed, for example, by
contacting a topoisomerase-charged first ds nucleic acid molecule, which has a first
topoisomerase covalently bound at a first end, and a second topoisomerase covalently
bound at a second end, and also contains a 5' overhang at the first end and a blunt end, a
3' uridine overhang, a 3' thymidine overhang, or a second 5' overhang at the second end;
and at least a second ds nucleic acid molecule, which has a first blunt end and a second
end, wherein the first blunt end has a 5' nucleotide sequence that is complementary to the
first 5' overhang of the first end of the first nucleic acid molecule. The first and second
topoisomerases can be the same, for example, two type IB topoisomerases such as two
Vaccinia type IB topoisomerases, or can be different, including two type IB
topoisomerases from different organisms or a type IB topoisomerase and a type IA or a
type II topoisomerase.

In performing a method of the invention, the first and second (or other) ds nucleic
acid molecules are contacted under conditions such that the 5' nucleotide sequence of the
first blunt end of the second nucleic acid molecule can selectively hybridize to the first
5' overhang, whereby the first topoisomerase can covalently link the 3' terminus of the
first end of the first ds nucleic acid molecule to the 5' terminus of the first blunt end of the
second ds nucleic acid molecule, and the second topoisomerase can covalently link the
3' terminus of the second end of the first ds nucleic acid molecule to the 5' terminus of the
second end of the second ds nucleic acid molecule, to generate a directionally linked
recombinant nucleic acid molecule. Accordingly, the present invention provides a
directionally or non-directionally linked recombinant nucleic acid molecule produced by
such a method.

In one aspect of performing a method of the invention, the second end of the first
topoisomerase-charged ds nucleic acid molecule has a blunt end, and the second end of
the second ds nucleic acid molecule has a blunt end. In another aspect, the second end of
the topoisomerase-charged first ds nucleic acid molecule has a 3' thymidine overhang,
and the second end of the second ds nucleic acid molecule has a 3' adenosine overhang, or
the second end of the topoisomerase-charged first ds nucleic acid molecule has a
3' uridine (or modified form thereof, for example, deoxyuridine) overhang, and the
second end of the second ds nucleic acid molecule has a 3' adenosine overhang. In yet
another aspect, the topoisomerase-charged first ds nucleic acid molecule has a second
5' overhang at the second end, and the second end of the second ds nucleic acid has a
nucleotide sequence complementary to the second 5' overhang. The topoisomerase-charged
first ds nucleic acid molecule can, but need not, be a vector, including a cloning
vector or an expression vector.

A method of the invention can further include introducing a directionally or
non-directionally linked recombinant nucleic acid molecule into a cell, which can be a
prokaryotic cell such as a bacterium or a eukaryotic cell such as a mammalian cell.
Accordingly, the present invention also provides a cell produced by a method of the
invention, as well as a non-human transgenic organism produced from such a cell.

The topoisomerase-charged first ds nucleic acid molecule can be a vector, and the
second ds nucleic acid molecule used in a method of the invention can be an
amplification product. In addition, the second ds nucleic acid molecule can be one of a
plurality of second ds nucleotide molecules, for example, individual members of a cDNA
library or a combinatorial library.

A method for generating a directionally or non-directionally linked recombinant
nucleic acid molecule also can be performed, for example, by contacting a first precursor
ds nucleic acid molecule having a first end, which has a first 5' target sequence at the
5' terminus and a topoisomerase recognition site at the 3' terminus, and a second end,
which has a topoisomerase recognition site at the 3' terminus; a second ds nucleic acid
molecule having a first blunt end and a second end, wherein the first blunt end has a
5' nucleotide sequence complementary to the 5' target sequence of the first precursor ds
nucleic acid molecule; and a topoisomerase that is specific for the topoisomerase
recognition site. The first ds nucleic acid molecule, second ds nucleic acid molecule and
topoisomerase are contacted under conditions that allow topoisomerase activity, i.e., such
that the topoisomerase can bind to and cleave the recognition site, to produce a
topoisomerase-charged 3' terminus, and can ligate the 3' terminus to an appropriate
5' terminus. Such conditions also allow hybridization of the portion of the first 5' target
sequence that remains following cleavage by the topoisomerase and the 5' nucleotide
sequence of the first blunt end of the second ds nucleic acid molecule, wherein the
5' nucleotide sequence of the first blunt end is complementary to that portion of the
5' target sequence.

In one aspect of performing a method of the invention, the second end of the first
precursor ds nucleic acid molecule is a blunt end upon cleavage by the topoisomerase,
and the second end of the second ds nucleic acid molecule is a blunt end. In another
aspect, the second end of the first precursor ds nucleic acid molecule has a 3' thymidine
extension upon cleavage by the topoisomerase, and the second end of the second ds
nucleic acid molecule comprises a 3' adenosine or 3' uridine, for example, deoxyuridine,
overhang. In yet another aspect, the first precursor ds nucleic acid molecule has a second
5' target sequence at the second end, and the second end of the second ds nucleic acid
molecule has a 5' nucleotide sequence complementary to at least a portion of the second
5' target sequence.

The first precursor ds nucleic acid molecule can be a vector, including a cloning
vector and an expression vector, and, where the vector generally is available in a circular
form, can be linearized due to the action of the topoisomerase, or can be linearized by
including, for example, one or two restriction endonucleases that linearize the vector such
that, upon contact with the topoisomerase, the first and second ds nucleic acid molecules
can be directionally or non-directionally linked according to a method of the invention.
The present invention also provides a directionally or non-directionally linked
recombinant nucleic acid molecule produced according to a method of the invention,
which can further include, for example, a step of introducing the directionally-linked
recombinant nucleic acid molecule into a cell. Accordingly, the present invention also
provides a cell containing such a directionally or non-directionally linked recombinant
nucleic acid molecule, as well as a transgenic non-human organism generated from such a
cell.

WO 02/16594    PCT/US01/26294

The first precursor ds nucleic acid molecule can include one or more (e.g., 1, 2, 3,
4, 5, 6, 7, etc.) expression control elements, which can be operatively linked to each other,
and the second ds nucleic acid molecule can encode all or a portion of an open reading
frame, wherein the expression control element is operatively linked to the open reading
frame in a directionally linked recombinant nucleic acid molecule generated according to
a method of the invention. In addition, the second ds nucleic acid molecule can be one of
a plurality of second ds nucleic acid molecules, for example, individual members of a
cDNA library.

A method for generating a directionally linked recombinant nucleic acid molecule
also can be performed by contacting a topoisomerase-charged first ds nucleic acid
molecule, which has, at a first end, a first 5' overhang and a first topoisomerase covalently
bound to the 3' terminus, and a second ds nucleic acid molecule, which has a first blunt
end and a second end, wherein the first blunt end includes a 5' nucleotide sequence
complementary to the first 5' overhang. The method is performed under conditions such
that the 5' nucleotide sequence of the first blunt end can selectively hybridize to the first
5' overhang, whereby the first topoisomerase can covalently link the 3' terminus of the
first end of the first ds nucleic acid molecule with the 5' terminus of the first end of the
second ds nucleic acid molecule.

Such a method can further include contacting the topoisomerase-charged first ds
nucleic acid molecule and the second ds nucleic acid molecule with a third ds nucleic acid
molecule, wherein a first end of the third ds nucleic acid molecule has a 5' overhang and a
second topoisomerase covalently bound at the 3' terminus, and wherein the second ds
nucleic acid molecule has a second blunt end, which includes a 5' nucleotide sequence
complementary to the second 5' overhang. The contacting can be performed, for
example, under conditions such that the 5' nucleotide sequence of the second blunt end of
the second ds nucleic acid can selectively hybridize to the 5' overhang of the first end of
the third ds nucleic acid molecule, whereby the second topoisomerase can covalently link
the 3' terminus of the first end of the third ds nucleic acid molecule with the 5' terminus of
the second blunt end of the second ds nucleic acid molecule. Similarly, the method can
be used to directionally or non-directionally link a fourth, fifth, sixth, or more ds nucleic
acid molecules, wherein the ends of such ds nucleic acid molecules are selected as
exemplified herein. The first and second (or other) topoisomerases can be the same or
different and, if desired, the first or third ds nucleic acid molecules, instead of being
topoisomerase-charged, can contain a topoisomerase recognition site, wherein the method
can further include contacting the reactants with a topoisomerase.

A method of the invention can be performed simultaneously or sequentially. A
method of the invention can be performed sequentially, for example, such that the first
ds nucleic acid molecule is directionally linked to the second ds nucleic acid molecule
and, at a later time or in a different reaction vessel, the third ds nucleic acid molecule is
directionally linked to the second ds nucleic acid molecule. Alternatively, the method can
be performed simultaneously, wherein all of the reactants are included together at the same
time.

Methods of the invention are particularly useful for operatively linking two or
more (e.g., 2, 3, 4, 5, 6, 7, 8, etc.) ds nucleic acid molecules, including, for example,
operatively linking an expression control element to an open reading frame, or
operatively linking a first and second open reading frame to generate a recombinant
nucleic acid molecule encoding a fusion protein, which can be further operatively linked
to one or more expression control elements. For example, in practicing a method of the
invention, a first ds nucleic acid molecule can include an expression control element, a
second ds nucleic acid molecule can encode an open reading frame, and a third ds nucleic
acid molecule can encode a peptide, wherein, in the directionally linked recombinant
nucleic acid molecule, the expression control element is operatively linked to the open
reading frame, and the second ds nucleic acid molecule is operatively linked to the third
ds nucleic acid molecule, and wherein the operatively linked second and third ds nucleic
acid molecules encode a fusion protein comprising the open reading frame and the
peptide. The peptide can be any peptide or polypeptide, including a gene product or other
open reading frame, a tag (e.g., an affinity tag), a detectable label, and/or the like.

The present invention also relates to a composition, which includes a first
ds nucleic acid molecule having a first end and a second end, wherein the first end has a
5' overhang and a topoisomerase covalently bound at the 3' terminus; and a second ds
nucleic acid molecule having a first blunt end and a second end, wherein the first blunt
end has a first 5' nucleotide sequence, which is complementary to the first 5' overhang,
and a first 3' nucleotide sequence complementary to the first 5' nucleotide sequence. In
such a composition, the first 5' nucleotide sequence of the first blunt end of the second ds
nucleic acid molecule can be hybridized to the first 5' overhang of the first end of the first
nucleic acid molecule, wherein the first 3' nucleotide sequence of the first blunt end of the
second ds nucleic acid molecule is displaced. The first ds nucleic acid molecule in such a
composition can further have a second 5' overhang at the second end, and the second end
of the second ds nucleic acid molecule can further include a second 5' nucleotide
sequence, which is complementary to the second 5' overhang, and a second 3' nucleotide
sequence complementary to the second 5' nucleotide sequence.

The present invention also relates to kits, which contain one or more reagents
useful for directionally linking ds nucleic acid molecules. In one embodiment, a kit of the
invention contains a ds nucleic acid molecule having a first end and a second end,
wherein the first end contains a first 5' overhang and a first topoisomerase covalently
bound at the 3' terminus, and the second end contains a second topoisomerase covalently
bound at the 3' terminus and contains a second 5' overhang, a blunt end, or a 3' thymidine
overhang, wherein the first 5' overhang is different from the second 5' overhang. The
topoisomerases can be the same or different, and the ds nucleic acid molecule can be a
vector, and can contain an expression control element.

In another embodiment, a kit of the invention contains a first ds nucleic acid
molecule, which has a first topoisomerase covalently bound at a 3' terminus of a first end,
and a second topoisomerase covalently bound at a 3' terminus of a second end, wherein
the first end also has a first 5' overhang and the second end also has a blunt end, a
3' thymidine overhang, or a second 5' overhang, wherein, when present, the second
5' overhang is different from the first 5' overhang; and a plurality of second ds nucleic
acid molecules, wherein each ds nucleic acid molecule in the plurality has a first blunt
end, and wherein the first blunt end includes a 5' nucleotide sequence complementary to
the first 5' overhang of the first ds nucleic acid molecule. The second ds nucleic acid
molecules in the plurality can be a plurality of transcriptional regulatory elements,
translational regulatory elements, or a combination thereof, or can encode a plurality of
peptides such as peptide tags, cell compartmentalization domains, and the like.

A kit of the invention can contain one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, etc.)
topoisomerase-charged ds nucleic acid molecules of the invention, for example, one or
more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, etc.) topoisomerase-charged vectors; one or more (e.g., 1,
2, 3, 4, 5, 6, 7, 8, etc.) precursor ds nucleic acid molecules, which can be contacted with a
topoisomerase to produce a topoisomerase-charged ds nucleic acid molecule of the
invention; or a combination thereof. The kit also can contain one or more primers or
primer pairs, for example, for preparing one or a plurality of second ds nucleic acid
molecules using an amplification reaction; one or more control ds nucleic acid molecules
to test or standardize the components of the kit; one or more cells, which can be, for
example, competent cells into which a recombinant nucleic acid molecule generated
according to a method of the invention can be introduced; one or more (e.g., 1, 2, 3, 4, 5,
6, 7, 8, etc.) reaction buffers for performing a method of the invention; instructions for
carrying out the method; and the like.

In one embodiment, a method for generating a directionally or non-directionally
linked recombinant nucleic acid molecule is performed using a first ds nucleic acid
molecule with one single stranded overhang, and one topoisomerase site or one
topoisomerase bound thereto. In another embodiment, a third nucleic acid molecule is
included. In accordance with this aspect of the invention, the different ds nucleic acid
molecules to be linked can be prepared having unique overhang sequences
such that the nucleic acid molecules can be linked directionally and in any
desired order. Similarly, the method can be used to link any number of nucleic acid
molecules, including directionally linking two or more of the number of nucleic acid
molecules. In certain embodiments involving a topoisomerase-charged ds nucleic acid
molecule containing an expression control element, a third (or other) ds nucleotide
sequence also can comprise one, two or more expression control elements or other
sequence of interest.
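
The requirement that each overhang be unique, so that molecules link in a desired order,
can be checked computationally. The Python sketch below is a simplified illustration and
not a procedure from the specification: it treats sequences as plain A/C/G/T strings,
requires exact complementarity, and uses hypothetical names (`pair_overhangs`, the
dictionary keys, and the example sequence ACTC). Each blunt end is represented by the
5'-terminal sequence of the strand whose 5' terminus lies at that end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pair_overhangs(overhangs: dict[str, str], blunt_ends: dict[str, str]) -> dict[str, str]:
    """Map each 5' overhang to the single blunt end it can hybridize with.

    overhangs:  name -> overhang sequence (5' to 3') on a topoisomerase-charged end
    blunt_ends: name -> 5'-terminal sequence (5' to 3') at a blunt end of a partner
    Raises ValueError when a junction is missing or ambiguous, i.e. when the
    overhang design would not give a single, directional assembly.
    """
    pairing: dict[str, str] = {}
    for oh_name, oh_seq in overhangs.items():
        # The blunt end must begin with the reverse complement of the overhang.
        target = reverse_complement(oh_seq)
        matches = [name for name, seq in blunt_ends.items()
                   if seq.upper().startswith(target)]
        if len(matches) != 1:
            raise ValueError(f"overhang {oh_name!r} matches {len(matches)} blunt ends")
        pairing[oh_name] = matches[0]
    if len(set(pairing.values())) != len(pairing):
        raise ValueError("two overhangs compete for the same blunt end")
    return pairing


if __name__ == "__main__":
    # Hypothetical three-molecule assembly: promoter -> ORF -> tag.
    overhangs = {"promoter_right": "GGTG", "tag_left": "ACTC"}
    blunt_ends = {"orf_left": "CACCATG", "orf_right": "GAGTTTA"}
    print(pair_overhangs(overhangs, blunt_ends))
```

A design in which two overhangs share a sequence fails this check, which mirrors the
reason the text calls for unique overhangs when a defined order is wanted.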

The present invention provides a method for the directional insertion of DNA
fragments into cloning or expression vectors with the ease and efficiency of
topoisomerase-mediated cloning. This method has advantages over current cloning
systems because it decreases the laborious screening process necessary to identify cloned
inserts in the desired orientation. In one aspect, the method utilizes a linearized
expression vector having a single topoisomerase molecule covalently attached at both
3' ends. A first end of the linearized vector also can contain a 5' single stranded overhang,
and the second end can be either blunt, possess a single 3' thymidine extension for T/A
cloning, or can itself contain a second 5' single stranded overhang sequence. The single
stranded overhang sequences can be any convenient or desired sequence.

Construction of a topoisomerase-charged cloning vector can be accomplished by
endonuclease digestion of the vector, followed by complementary annealing of synthetic
oligonucleotides and site-specific cleavage of the heteroduplex by Vaccinia
topoisomerase I. Digestion of a vector with any compatible endonuclease creates specific
sticky ends. Custom oligonucleotides are annealed to these sticky ends, and possess
sequences that, following topoisomerase I modification, form custom ends of the vector.
The sequence and length of the single stranded overhang will vary based on the desires of
the user.

In a preferred use of the single strand sequence topoisomerase-charged ds nucleic
acid vectors provided by the present invention, the DNA fragment to be inserted into the
vector is an amplification reaction product such as a PCR product. Following PCR
amplification with custom primers, the product can be directionally inserted into a
topoisomerase I charged cloning vector having a single strand sequence on one or both
ends of the insertion site. The custom primers can be designed such that at least one
primer of a given primer pair contains an additional sequence at its 5' end. The added
sequence is designed to be complementary to the sequence of the single stranded
overhang in the vector. The complementarity between the 5' single stranded overhang in
the vector and the 5' end of the PCR product mediates the directional insertion of the PCR
product into the topoisomerase-mediated vector. Specifically, since only one end of the
vector and one end of the PCR product possess complementary single stranded sequence
regions, the insertion of the product in this instance is directional, and topoisomerase can
catalyze ligation of the PCR product to the vector.
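The primer design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the gene-specific primer sequence is invented, and the vector overhang used is the 5'-GGTG-3' example sequence mentioned elsewhere in this specification.

```python
# Illustrative sketch only; helper names and the gene-specific primer are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tag_forward_primer(gene_specific: str, vector_overhang: str) -> str:
    """Prepend, at the primer's 5' end, the sequence complementary to the
    vector's 5' single stranded overhang (both arguments written 5'->3')."""
    return reverse_complement(vector_overhang) + gene_specific.upper()


# GGTG is the example overhang; the gene-specific part is made up.
forward_primer = tag_forward_primer("ATGGCTAGCAAAGGAGAAG", "GGTG")
print(forward_primer)  # CACCATGGCTAGCAAAGGAGAAG
```

Only the primer for the end that should pair with the overhang is tagged; the other primer of the pair is left unmodified, which is what makes the insertion directional.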

BRIEF DESCRIPTION OF THE FIGURES 

Figures 1A to 1E depict a number of ds nucleic acid molecules that can be used to
practice various aspects of the invention. The circled and boxed areas shown in these
depictions indicate regions which contain sufficient nucleotide sequence complementarity
to engage in strand invasion with each other.

Figure 1A shows two ds nucleic acid molecules (labeled "first" and "second"
molecules) which each contain one terminus that is capable of engaging in strand
invasion with a terminus of the other molecule (see boxes). When a topoisomerase is
used to covalently link (e.g., ligate) strands of each molecule, the 3' recessed strand of the
terminus of the first molecule will generally be charged with topoisomerase. Further, this
topoisomerase will generally catalyze the covalent linkage of the 3' recessed strand of the
terminus of the first molecule to the 5' strand of the second molecule with which it
engages in strand invasion (i.e., the 5' terminus of the second nucleic acid molecule which
is shown in the box).

Figure 1B shows two ds nucleic acid molecules (labeled "first" and "second"
molecules), each of which contains two termini that are capable of engaging in strand
invasion with termini of the other molecule. Further, each of these two nucleic acid
molecules has a blunt terminus and a terminus with a 5' single stranded overhang (see
circles and boxes). The nucleic acid molecules in this depiction can thus engage in two
separate strand invasion events which, upon covalent linkage of nucleic acid strands at
each terminus, result in the formation of a single, circular nucleic acid molecule. Covalent
linkage of the termini can be performed as described, for example, for Figure 1A, above.

Figure 1C shows two ds nucleic acid molecules (labeled "first" and "second"
molecules), each of which contains two termini that are capable of engaging in strand
invasion with different termini of the other molecule. Further, one of these molecules has
two blunt termini and the other molecule has 5' single stranded overhangs on each
terminus. The molecules in this depiction can thus engage in two separate strand invasion
events which, upon covalent linkage of nucleic acid strands at each terminus, result in the
formation of a circular nucleic acid molecule. Covalent linkage of the termini can be
performed as described for Figure 1A, above.

Figure 1D shows three ds nucleic acid molecules (labeled "first", "second" and
"third" molecules). Two of these molecules ("first" and "third" molecules) contain
5' single stranded overhangs which are capable of engaging in strand invasion with
different blunt termini of the other molecule ("second" molecule). The molecules in this
depiction can thus engage in two separate strand invasion events, which result in the
generation of a linear nucleic acid molecule composed of all three molecules. Covalent
linkage of the termini can be performed as described for Figure 1A, above.

Figure 1E shows nucleic acid molecules similar to those set out in Figure 1D,
above, except that one of the nucleic acid molecules ("second" molecule) has
5' overhangs at both termini and the other two nucleic acid molecules ("first" and
"third" molecules) each have two blunt termini.

Figure 2 illustrates an aspect of the invention involving strand invasion of a first
ds nucleic acid molecule with a substantially blunt end containing a topoisomerase at a
3' terminus of a first strand containing a 5' tail upstream of a topoisomerase recognition
site; and a second ds nucleic acid molecule having a 3' overhang complementary to the
5' tail (see Cheng and Shuman, Mol. Cell. Biol. 20:8059-8068, 2000). The boxed areas
shown in these depictions indicate regions which contain sufficient nucleotide sequence
complementarity to engage in strand invasion with each other.

Figure 3 provides the nucleotide sequence and the location of restriction
endonuclease recognition sequences of the Multiple Cloning Site of pUni/V5-His
version A (SEQ ID NO: 18), and a plasmid map of this 2.3 kb vector. The EcoRI cloning
site is located at nucleotide 471, and the SacI cloning site is located at nucleotide 528.

Figure 4 provides the nucleotide sequence and the location of restriction
endonuclease recognition sequences of the Multiple Cloning Site of pCR 2.1, and a
plasmid map of this vector. The HindIII cloning site is located at nucleotide 234, SpeI is at
nucleotide 258 and EcoRI is at nucleotide 283 and nucleotide 299. The vector is
3906 nucleotides. LacZ alpha fragment: bases 1-587; M13 reverse priming site:
bases 205-221; Multiple cloning site: bases 234-355; T7 promoter/priming site: bases
362-381; M13 forward (-20) priming site: bases 389-404; M13 forward (-40) priming
site: bases 408-424; f1 origin: bases 546-960; kanamycin resistance ORF: bases
1294-2088; ampicillin resistance ORF: bases 2106-2966; ColE1 origin: bases 3111-3784.
The illustrated vector represents the pCR®2.1 vector with a PCR product inserted by
TA Cloning®. Note that the inserted PCR product is flanked on each side by EcoRI sites.
The arrow indicates the start of transcription for the T7 RNA polymerase.

Figure 5 provides the nucleotide sequence of the vector pUni/V5-His version A
(SEQ ID NO:16).

Figure 6 illustrates digestion of pUni/V5-His version A with EcoRI and SacI, and
the resulting cohesive end sequences. The cohesive end on the left side of the
figure, near the loxP element, results from EcoRI digestion. The cohesive end on
the right side of the figure, near the V5 element, results from SacI digestion.
Vector elements including a loxP, V5, and 6XHis element, as well as a stop codon
in frame with these elements, are indicated.

Figure 7 illustrates the addition of adapter oligonucleotides to the digested vector
in the presence of DNA ligase. The reaction yields the exhibited linearized, adapted
vector. Adapter sequences are underlined for demarcation. The four adapter
oligonucleotides have the following sequences:

TOPO D1: 5'-AATTGATCCCTTCACCGACATAGTACAG-3' (SEQ ID NO:5)

TOPO D2: 3'-CTAGGGAAGTGG-5' (SEQ ID NO:6)

TOPO D3: 3'-GACATGATACAGTTCCCGC-5' (SEQ ID NO:8)

TOPO D4: 5'-AAGGGCGAGCT-3' (SEQ ID NO:7)

The T4 ligation reaction will yield the indicated linearized cloning vector; adapter
sequences are underlined for demarcation.

Figure 8 illustrates a topoisomerase cleavage reaction wherein following
topoisomerase cleavage of the scissile strand, a phosphate bond in the non-scissile strand
keeps the leaving group associated to the vector. In the reaction shown, topoisomerase is
added to the depicted ds nucleic acid molecule. Topoisomerase binds CCCTT and breaks
the adjacent phosphodiester bond. Phosphodiester bonds between the adapted vector and
the annealing oligo in the non-scissile strand prevent the dissociation of the leaving group
upon cleavage. In the double stranded DNA model illustrated, X and x represent
complementary nucleotide bases.

Figure 9 illustrates a topoisomerase cleavage reaction wherein following
topoisomerase cleavage of the scissile strand, the lack of a phosphate bond in the
non-scissile strand allows the leaving group to dissociate from the vector. In the reaction
shown, topoisomerase is added to the depicted ds nucleic acid molecule. Topoisomerase
binds CCCTT and breaks the adjacent phosphodiester bond. Lack of a phosphodiester
bond between the adapted vector and the annealing oligo in the non-scissile strand allows
the dissociation of the leaving group upon cleavage. In the double stranded DNA model
illustrated, X and x represent complementary nucleotide bases.

Figure 10 illustrates that addition of an annealing oligonucleotide to the linearized,
adapted vector in the absence of DNA ligase yields the exhibited linearized, adapted and
annealed vector. Note that the annealing oligonucleotide is not bound to the vector by a
phosphate bond, thus allowing dissociation following topoisomerase mediated cleavage.
Adapter oligonucleotides are demarcated by a single underline, while annealing
oligonucleotides are demarcated by a double underline. There are no phosphodiester
linkages between either of the TOPO D3s and their adjacent oligonucleotide TOPO D2
and TOPO DS. The annealing oligonucleotide has the following sequence and is
complementary to both TOPO D1's and TOPO D4's single stranded overhang: TOPO D3
3'-CTGTATCATGTCAAC-5' (SEQ ID NO:10).
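The cleavage step of Figures 8 to 10 can be modeled on the TOPO D1/TOPO D2 adapter pair as a rough sketch. The function name and the alignment offset are assumptions made for illustration: TOPO D2 is taken to pair with TOPO D1 starting immediately after D1's four 5'-terminal AATT bases, which is the register at which the two listed sequences are fully complementary.

```python
def cleave_type_ib(top_5to3, bottom_3to5, bottom_offset, site="CCCTT"):
    """Model type IB cleavage of the scissile (top) strand just 3' of `site`.

    bottom_3to5 is written 3'->5' so that it reads in register with the top
    strand; bottom_offset is the top-strand index paired with its first base.
    Returns (retained_top, leaving_group, exposed_overhang_5to3).
    """
    top = top_5to3.upper()
    cut = top.index(site) + len(site)  # scissile bond follows the site
    retained, leaving_group = top[:cut], top[cut:]
    # Bottom-strand bases that were paired to the leaving group become single
    # stranded once it dissociates; reverse them to report the overhang 5'->3'.
    exposed = bottom_3to5.upper()[max(cut - bottom_offset, 0):]
    return retained, leaving_group, exposed[::-1]


TOPO_D1 = "AATTGATCCCTTCACCGACATAGTACAG"  # written 5'->3'
TOPO_D2 = "CTAGGGAAGTGG"                  # written 3'->5', as listed for Figure 7

retained, leaving, overhang = cleave_type_ib(TOPO_D1, TOPO_D2, bottom_offset=4)
print(retained)  # AATTGATCCCTT
print(leaving)   # CACCGACATAGTACAG
print(overhang)  # GGTG
```

Under these assumptions the exposed 5' overhang is GGTG, and the leaving group begins with CACC, its complement.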
Figure 11 shows an example of a linearized topoisomerase-charged ds nucleic
acid cloning vector of the invention. The single stranded overhang corresponds to a
Kozak transcription sequence. The vector illustrated is a linearized TOPO FLAP cloning
vector, modified pUni/His version A.

Figure 12 is the nucleotide sequence of vector pCR 2.1 (SEQ ID NO:17).

Figures 13A and 13B show forms of the pCR2.1® vector.

Figure 13A shows pCR2.1® following restriction digestion with EcoRI and
HindIII (note the resulting sticky ends). Four adapter oligonucleotides were ligated to the
linearized vector. TOPO binding sites on the oligonucleotides have the sequence CCCTT
(underlined). Sticky end complementary bases are depicted in bold. The four adapter
oligonucleotides had the following sequences:

TOPO H: 5'-AGCTCGCCCTTATTCCGATAGTG-3' (SEQ ID NO: 11);
TOPO 16: 3'-GCGGGAATAAG-5' (SEQ ID NO: 12);
TOPO 1: 5'-AATTCGCCCTTATTCCGATAGTG-3' (SEQ ID NO: 13); and
TOPO 2: 3'-GCGGGAA-5'

TOPO H and TOPO 1 have 5' ends that complement the HindIII and EcoRI sticky ends,
respectively.

Figure 13B shows the adapted version of pCR2.1® following incubation with the
adapter oligos in the presence of T4 ligase.

Figure 14 illustrates the addition of annealing oligonucleotides to the adapted
pCR2.1® vector, followed by the binding of topoisomerase I and the topoisomerase
mediated cleavage of the double stranded vector. The resulting vector is linear and
charged with topoisomerase I on both ends. Also, one end of the vector has the custom
4 bp single stranded sequence, while the other end is blunt. In the initial reaction
illustrated, topoisomerase binds and cleaves the double stranded DNA at the 5' end of the
covalent binding site located near the ends of pCR2.1®, which contain the bound adapter
and annealing oligonucleotides. This step is performed in the presence of T4
polynucleotide kinase. The annealing oligonucleotides have the following sequences:

TOPO 3: 3'-TAAGGCTATCACAAC-5' (SEQ ID NO: 15); and
TOPO 17: 3'-GCTATCAC-5'

There are no phosphodiester bonds formed between TOPO 3 and TOPO 2, or between
TOPO 17 and TOPO 16. The annealing oligonucleotides are double underlined for
demarcation.
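As a quick consistency check on the oligonucleotides listed for Figures 13A and 14, the sketch below counts Watson-Crick pairs between each leaving group (the adapter sequence 3' of CCCTT) and its annealing oligonucleotide. The function name and the alignment registers are assumptions: TOPO 3 is taken to begin pairing at the first base after the cleavage site, and TOPO 17 four bases further along, after the stretch covered by the TAAG end of TOPO 16.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def paired_run(top_5to3, bottom_3to5):
    """Count consecutive Watson-Crick pairs, starting from the left ends of a
    top strand (5'->3') and a bottom strand written in register (3'->5')."""
    count = 0
    for top_base, bottom_base in zip(top_5to3.upper(), bottom_3to5.upper()):
        if PAIR.get(top_base) != bottom_base:
            break
        count += 1
    return count


SITE = "CCCTT"
TOPO_1 = "AATTCGCCCTTATTCCGATAGTG"  # 5'->3'
TOPO_H = "AGCTCGCCCTTATTCCGATAGTG"  # 5'->3'
TOPO_3 = "TAAGGCTATCACAAC"          # written 3'->5'
TOPO_17 = "GCTATCAC"                # written 3'->5'

leaving_1 = TOPO_1[TOPO_1.index(SITE) + len(SITE):]  # ATTCCGATAGTG
leaving_h = TOPO_H[TOPO_H.index(SITE) + len(SITE):]  # ATTCCGATAGTG

print(paired_run(leaving_1, TOPO_3))       # 12: the whole leaving group pairs
print(paired_run(leaving_h[4:], TOPO_17))  # 8: all of TOPO 17 pairs
```

Under these assumed registers each annealing oligonucleotide is fully complementary to the portion of the leaving group it covers, with TOPO 3 extending three bases beyond it.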

Figure 15 illustrates a second example of a linearized topoisomerase-charged
ds nucleic acid cloning vector of the present invention. In this example the single
stranded overhang sequence is 3'-TAAG-5'. This vector is the linearized, TOPO charged,
FLAP vector, modified pCR2.1®.

Figure 16 illustrates PCR amplification of a gene of interest using primers
designed for directional cloning. The resulting product possesses the necessary single
stranded overhang for directional cloning using a vector of the invention. The primer
CACC depicted in the top illustration is homologous to the coding strand of the gene of
interest, and has the "FLAP" sequence added to its 5' end. Standard PCR amplification of
the gene of interest in the presence of the appropriate primers, including the CACC
containing primer, gives the product depicted in the bottom illustration. The product is a
double stranded gene of interest amplicon with flap sequence at its 5' end.

Figure 17 illustrates how double stranded nucleic acid vectors of the present invention,
including a TOPO FLAP cloning vector, which possesses a single stranded overhang, can
facilitate insertion of amplified DNA in the proper orientation. Once the product is
correctly inserted, topoisomerase will ligate it to the vector.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides compositions and methods of using strand
invasion to directionally or non-directionally link two or more double stranded
(ds) nucleic acid molecules. For example, the present invention provides a ds nucleic
acid molecule having a first end and a second end, wherein the first end contains a first
5' overhang and a first topoisomerase covalently bound at the 3' terminus, and the second
end contains a second topoisomerase covalently bound at the 3' terminus and contains a
second 5' overhang, a blunt end, a 3' uridine overhang, or a 3' thymidine overhang,
wherein the first 5' overhang is different from the second 5' overhang. The first
topoisomerase and second topoisomerase can be the same or different. The first
5' overhang can have any nucleotide sequence, including, for example, the nucleotide
sequence 5'-GGTG-3'.

Aspects of the present invention modify topoisomerase-mediated cloning so as to
allow DNA fragments, including PCR-generated ORFs, to be directionally inserted into
cloning vectors, while maintaining the advantages provided by ligation using
topoisomerase. The system greatly reduces the amount of work involved in screening to
identify clones containing inserts in the desired orientation by enabling directional
cloning efficiencies that are routinely in excess of 90%. The present invention
streamlines high throughput gene expression operations and reduces costs associated with
the screening process, and provides additional advantages.

A topoisomerase-charged ds nucleic acid molecule of the invention generally has
a single stranded overhang and a first topoisomerase covalently bound at or near a
terminus of a first end. In addition, a topoisomerase-charged ds nucleic acid molecule of
the invention can include a second topoisomerase covalently bound at or near a terminus
of the second end. The single stranded overhang can be a 5' overhang, and each
topoisomerase can be bound at or near one or both 3' termini. Where a topoisomerase is
bound to one, or preferably both, 3' termini, the second end of the topoisomerase-charged
ds nucleic acid molecule of the present invention typically is a blunt end, a 3' thymidine
overhang, or a second 5' overhang that is different from the first 5' overhang.

As used herein, reference to a nucleic acid molecule having "a first end" and "a
second end" means that the nucleic acid molecule is linear. The term "single stranded
overhang" or "overhang" is used herein to refer to a strand of a ds nucleic acid molecule
that extends beyond the terminus of the complementary strand of the ds nucleic acid
molecule. The term "5' overhang" or "5' overhanging sequence" is used herein to refer to
a strand of a ds nucleic acid molecule that extends in a 5' direction beyond the 3' terminus
of the complementary strand of the ds nucleic acid molecule. The term "3' overhang" or
"3' overhanging sequence" is used herein to refer to a strand of a ds nucleic acid molecule
that extends in a 3' direction beyond the 5' terminus of the complementary strand of the ds
nucleic acid molecule. Conveniently, a 5' overhang can be produced as a result of site
specific cleavage of a ds nucleic acid molecule by a type IB topoisomerase (see
Examples 1 and 2). Similarly, a 3' overhang can be produced upon cleavage of a
ds nucleic acid molecule by a type IA or type II topoisomerase.

The 3' overhang and 5' overhang can have any nucleotide sequence and can be any
length (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, etc. nucleotides), but will generally be at least two
nucleotides. The overhanging sequences can be selected such that they allow ligation of a
predetermined end of one ds nucleic acid molecule to a predetermined end of a second
nucleic acid molecule according to a method of the invention. Where the double stranded
nucleic acid molecules are directionally linked, the 3' or 5' overhangs are generally not
palindromic because ds nucleic acid molecules having palindromic overhangs can
associate with each other, thus reducing the yield of a directionally linked recombinant
nucleic acid molecule comprising two or more ds nucleic acid molecules in a
predetermined orientation. The overhang can comprise, for example, a nucleotide
sequence of a transcriptional or translational regulatory element such as a promoter,
Kozak sequence, start codon, or the like, or a complement of such a nucleotide sequence.
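Whether a candidate overhang is palindromic in the sense used above (identical to its own reverse complement, so that two copies can pair with each other) is easy to test. The following is a minimal sketch with hypothetical function names; AATT and GATC are included only as familiar self-complementary examples.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(overhang: str) -> bool:
    """True if the overhang equals its own reverse complement, meaning two
    molecules carrying it could anneal to each other."""
    seq = overhang.upper()
    return seq == reverse_complement(seq)


for candidate in ("GGTG", "AATT", "GATC"):
    print(candidate, is_palindromic(candidate))
# GGTG False
# AATT True
# GATC True
```

An overhang of odd length can never be palindromic in this sense, because its central base would have to be its own complement.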

A 3' overhang or 5' overhang can include virtually any nucleotide or nucleotide
analog or modified nucleotide that can hybridize with a complementary nucleotide
residue, provided that at least a portion of the nucleotide sequence of the overhang can
hybridize with the complementary sequence. Thus, the nucleosides in an overhang can
include naturally occurring nucleotides such as purines (guanosine (G) or adenosine (A)),
or pyrimidines (thymidine (T), uridine (U) or cytidine (C)). Additionally, the overhang
can include substitutes for the nucleosides, for example, a nucleoside such as inosine, or a
modified form of a nucleoside such as methyl guanosine, or a 5-halogenated pyrimidine
nucleoside (e.g., 5-bromodeoxy uridine or 5-methyl deoxycytidine). If desired, the
overhang can have a relatively high GC content, for example, the overhang can have a
greater than 50% GC content, such as 66% GC or 75% GC or 80% GC or 100% GC
content. In one embodiment, the overhang has the sequence 5'-GGTG-3'.
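The GC content figures quoted above are simple fractions of the overhang length. A minimal sketch (the function name is hypothetical) shows that the 5'-GGTG-3' overhang falls at 75% GC:

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    bases = seq.upper()
    if not bases:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in bases) / len(bases)


print(gc_content("GGTG"))  # 0.75
```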
A 5' or 3' overhang of a first nucleic acid molecule, for example, can include one
or two or a few nucleotide residues, for example, at the free terminus of the overhang, for
which a complementary nucleotide residue is not present in the complementary sequence
at or near the substantially blunt end of the second (or other) ds nucleic acid molecule to
which it is being linked. Nevertheless, the overhang at the end of the first nucleic acid
molecule can selectively hybridize to the complementary sequence of the second nucleic
acid molecule due to the other nucleotide residues in the overhang. For example, where a
5' overhang consists of six nucleotides, the 5'-most one or two nucleotides need not be
complementary to the corresponding nucleotides in the complementary nucleotide
sequence in the second nucleic acid molecule, but selective hybridization nevertheless can
occur due to the complementarity of the remaining four nucleotide residues. The number
or specific positions of non-complementary nucleotide residues that can be in an
overhang (or in the "complementary" sequence in the second nucleic acid molecule)
without substantially reducing or inhibiting hybridization specificity can be determined
using routine hybridization methods.
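The six-nucleotide case described above can be illustrated by listing which overhang positions fail to pair with a target sequence. This is a sketch only: the function name is hypothetical and both sequences are invented for illustration (a GGTG core with two extra 5' bases); it does not model hybridization energetics, which the text leaves to routine hybridization methods.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_positions(overhang_5to3, target_5to3):
    """Return 0-based overhang positions (counted from its 5' end) that do not
    form a Watson-Crick pair with the antiparallel target strand."""
    if len(overhang_5to3) != len(target_5to3):
        raise ValueError("sequences must be the same length")
    target_3to5 = target_5to3.upper()[::-1]  # read the target in register
    return [
        i
        for i, (a, b) in enumerate(zip(overhang_5to3.upper(), target_3to5))
        if PAIR.get(a) != b
    ]


# Hypothetical six-nucleotide 5' overhang and two hypothetical targets.
print(mismatch_positions("TTGGTG", "CACCAA"))  # []     fully complementary
print(mismatch_positions("TTGGTG", "CACCGG"))  # [0, 1] 5'-most two unpaired
```

In the second case the remaining four positions (the GGTG core) are still complementary, which corresponds to the situation the paragraph describes.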

The nucleotide residues of the overhang can include locked nucleic acid ("LNA")
analogues (Proligo; Boulder CO). LNA monomers are bicyclic compounds that are
structurally similar to ribonucleosides. The term "Locked Nucleic Acid" was coined to
emphasize that the furanose ring conformation is restricted in an LNA by a methylene
linker that connects the 2'-O position to the 4'-C position. As used herein, all nucleic acid
molecules containing one or more LNA modifications are referred to as LNA molecules.
LNA oligomers obey Watson-Crick base pairing rules and hybridize to complementary
oligonucleotides. LNA can provide vastly improved hybridization, stability, and
increased thermal stability performance when compared to DNA and other nucleic acid
derivatives in a number of situations (Koshkin et al., Tetrahedron 54:3607-30, 1998;
Koshkin et al., J. Am. Chem. Soc. 120:13252-53, 1998; Wahlestedt et al., Proc. Natl.
Acad. Sci., USA 97:5633-38, 2000).






It should be recognized that reference to a first end or a second end of a ds nucleic
acid molecule is not intended to imply any particular orientation of the nucleic acid
molecule, and is not intended to imply a relative importance of the ends with respect to
each other. Where a nucleic acid molecule having a first end and second end is a double
stranded nucleic acid molecule, each end contains a 5' terminus and a 3' terminus. Thus,
reference is made herein, for example, to a nucleic acid molecule containing a
topoisomerase recognition site at a 3' terminus and a hydroxyl group at the 5' terminus of
the same end, which can be the first end or the second end.

Topoisomerase, when bound to a nucleic acid molecule, will generally be bound
"at or near" a terminus of a ds nucleic acid molecule. The term "at or near," when used
with respect to a topoisomerase, means that the topoisomerase is covalently bound to one
strand of a ds nucleic acid molecule such that it can ligate the terminus of the strand to
which it is bound, to a second nucleic acid molecule containing a free 5' terminal
hydroxyl group. Generally, the topoisomerase is "at or near" an end by virtue of being
covalently bound to one terminus of the end. For example, where the topoisomerase is a
type IB topoisomerase such as a Vaccinia topoisomerase, the topoisomerase is bound at
the 3' terminus of an end of a ds nucleic acid molecule. However, an end having a
topoisomerase covalently bound to a terminus of the end also can contain a single
stranded overhang sequence in the complementary strand, thus extending beyond the
terminus to which the topoisomerase is bound. Such a topoisomerase is an example of a
topoisomerase near an end of the ds nucleic acid molecule.

As used herein, the term "isolated," when used in reference to a molecule, means
that the molecule is in a form other than that in which it exists in nature. In general, an
isolated nucleic acid molecule, for example, can be any nucleic acid molecule that is not
part of a genome in a cell, or is separated physically from a cell that normally contains the
nucleic acid molecule. It should be recognized that various compositions of the invention
comprise a mixture of isolated ds nucleic acid molecules. As such, it will be understood
that the term "isolated" is used only in respect to the isolation of the molecule from its
natural state, and does not indicate that the molecule is the only constituent.

Topoisomerases are a class of enzymes that modify the topological state of DNA
via the breakage and rejoining of DNA strands (Shuman et al., U.S. Pat. No. 5,766,891,
incorporated herein by reference). Topoisomerases are categorized as type I, including
type IA and type IB topoisomerases, which cleave a single strand of a double stranded
nucleic acid molecule, and type II topoisomerases (gyrases), which cleave both strands of
a nucleic acid molecule. As disclosed herein, type I and type II topoisomerases, as well
as catalytic domains and mutant forms thereof, are useful for generating directionally
linked recombinant nucleic acid molecules according to a method of the invention.
Type II topoisomerases have not generally been used for generating recombinant nucleic
acid molecules or cloning procedures, whereas type IB topoisomerases are used in a
variety of procedures.

Type IA and IB topoisomerases cleave one strand of a ds nucleic acid molecule.
Cleavage of a ds nucleic acid molecule by type IA topoisomerases generates a
5' phosphate and a 3' hydroxyl at the cleavage site, with the type IA topoisomerase
covalently binding to the 5' terminus of a cleaved strand. In comparison, cleavage of a
ds nucleic acid molecule by type IB topoisomerases generates a 3' phosphate and a
5' hydroxyl at the cleavage site, with the type IB topoisomerase covalently binding to the
3' terminus of a cleaved strand. Type IA topoisomerases include, for example, E. coli
topoisomerase I and topoisomerase III, eukaryotic topoisomerase II, and archaeal reverse
gyrase (see Berger, Biochim. Biophys. Acta 1400:3-18, 1998, which is incorporated
herein by reference).

Type IB topoisomerases include the nuclear type I topoisomerases present in all
eukaryotic cells and those encoded by Vaccinia and other cellular poxviruses (see Cheng
et al., Cell 92:841-850, 1998, which is incorporated herein by reference). The eukaryotic
type IB topoisomerases are exemplified by those expressed in yeast, Drosophila and
mammalian cells, including human cells (see Caron and Wang, Adv. Pharmacol.
29B:271-297, 1994; Gupta et al., Biochim. Biophys. Acta 1262:1-14, 1995, each of
which is incorporated herein by reference; see, also, Berger, supra, 1998). Viral type IB
topoisomerases are exemplified by those produced by the vertebrate poxviruses (Vaccinia,
Shope fibroma virus, ORF virus, fowlpox virus, and molluscum contagiosum virus), and
the insect poxvirus (Amsacta moorei entomopoxvirus) (see Shuman, Biochim. Biophys.
Acta 1400:321-337, 1998; Petersen et al., Virology 230:197-206, 1997; Shuman and
Prescott, Proc. Natl. Acad. Sci., USA 84:7478-7482, 1987; Shuman, J. Biol. Chem.
269:32678-32684, 1994; U.S. Pat. No. 5,766,891; PCT/US95/16099; PCT/US98/12372,
each of which is incorporated herein by reference; see, also, Cheng et al., supra, 1998).

Type II topoisomerases include, for example, bacterial gyrase, bacterial DNA
topoisomerase IV, eukaryotic DNA topoisomerase II, and T-even phage encoded DNA
topoisomerases (Roca and Wang, Cell 71:833-840, 1992; Wang, J. Biol. Chem.
266:6659-6662, 1991, each of which is incorporated herein by reference; Berger, supra, 1998).
Like the type IB topoisomerases, the type II topoisomerases have both cleaving and
ligating activities. In addition, like type IB topoisomerase, substrate ds nucleic acid
molecules can be prepared such that the type II topoisomerase can form a covalent
linkage to one strand at a cleavage site. For example, calf thymus type II topoisomerase
can cleave a substrate ds nucleic acid molecule containing a 5' recessed topoisomerase
recognition site positioned three nucleotides from the 5' end, resulting in dissociation of
the three-nucleotide nucleic acid molecule 5' to the cleavage site and covalent binding of the
topoisomerase to the 5' terminus of the ds nucleic acid molecule (Andersen et al., supra,
1991). Furthermore, upon contacting such a type II topoisomerase-charged ds nucleic
acid molecule with a second nucleic acid molecule containing a 3' hydroxyl group, the
type II topoisomerase can ligate the sequences together, and then is released from the
recombinant nucleic acid molecule. As such, type II topoisomerases also are useful for
performing methods of the invention.

Structural analysis of topoisomerases indicates that the members of each particular
topoisomerase family, including type IA, type IB and type II topoisomerases, share
common structural features with other members of the family (Berger, supra, 1998). In
addition, sequence analysis of various type IB topoisomerases indicates that the structures
are highly conserved, particularly in the catalytic domain (Shuman, supra, 1998; Cheng et
al., supra, 1998; Petersen et al., supra, 1997). For example, a domain comprising amino
acids 81 to 314 of the 314 amino acid Vaccinia topoisomerase shares substantial
homology with other type IB topoisomerases, and the isolated domain has essentially the
same activity as the full length topoisomerase, although the isolated domain has a slower
turnover rate and lower binding affinity to the recognition site (see Shuman, supra, 1998;
Cheng et al., supra, 1998). In addition, a mutant Vaccinia topoisomerase, which is
mutated in the amino terminal domain (at amino acid residues 70 and 72), displays
properties identical to those of the full length topoisomerase (Cheng et al., supra, 1998). In fact,
mutation analysis of Vaccinia type IB topoisomerase reveals a large number of amino acid
residues that can be mutated without affecting the activity of the topoisomerase, and has
identified several amino acids that are required for activity (Shuman, supra, 1998). In
view of the high homology shared among the Vaccinia topoisomerase catalytic domain
and the other type IB topoisomerases, and the detailed mutation analysis of Vaccinia
topoisomerase, it will be recognized that isolated catalytic domains of the type IB
topoisomerases and type IB topoisomerases having various amino acid mutations can be
used in the methods of the invention and thus are considered to be topoisomerases for
purposes of the present invention.

The various topoisomerases exhibit a range of sequence specificity. For example,
type II topoisomerases can bind to a variety of sequences, but cleave at a highly specific
recognition site (see Andersen et al., J. Biol. Chem. 266:9203-9210, 1991, which is
incorporated herein by reference). In comparison, the type IB topoisomerases include
site specific topoisomerases, which bind to and cleave a specific nucleotide sequence
("topoisomerase recognition site"). Upon cleavage of a ds nucleic acid molecule by a
topoisomerase, for example, a type IB topoisomerase, the energy of the phosphodiester
bond is conserved via the formation of a phosphotyrosyl linkage between a specific
tyrosine residue in the topoisomerase and the 3' nucleotide of the topoisomerase
recognition site. Where the topoisomerase cleavage site is near the 3' terminus of the
nucleic acid molecule, the downstream sequence (3' to the cleavage site) can dissociate,
leaving a nucleic acid molecule having the topoisomerase covalently bound to the newly
generated 3' end (see FIG. 9).

The covalently bound topoisomerase also can catalyze the reverse reaction, for
example, covalent linkage of the 3' nucleotide of the recognition sequence, to which a
type IB topoisomerase is linked through the phosphotyrosyl bond, and a nucleic acid
molecule containing a free 5' hydroxyl group. As such, methods have been developed for
using a type IB topoisomerase to produce recombinant nucleic acid molecules. In particular,
cloning vectors containing a bound type IB topoisomerase have been developed and are
commercially available (Invitrogen Corp., Carlsbad CA). Such cloning vectors, when
linearized, contain a covalently bound type IB topoisomerase at each 3' end
("topoisomerase-charged"). Nucleic acid molecules such as those comprising a cDNA
library, or restriction fragments, or sheared genomic DNA sequences that are to be cloned
into such a vector are treated, for example, with a phosphatase to produce 5' hydroxyl
termini, then are added to the linearized vector under conditions that allow the
topoisomerase to ligate the nucleic acid molecules at the 5' terminus containing the
hydroxyl group and the 3' terminus containing the covalently bound topoisomerase. A
nucleic acid molecule such as a PCR amplification product, which is produced containing
5' hydroxyl ends, can be cloned into a topoisomerase-charged vector in a rapid joining
reaction (approximately 5 min at room temperature). The rapid joining and broad
temperature range inherent to the topoisomerase joining reaction make the use of
topoisomerase-charged vectors ideal for high throughput applications, which generally
are performed using automated systems.

Vaccinia virus encodes a 314 amino acid type I topoisomerase enzyme capable of
site-specific single-strand nicking of double stranded DNA, as well as 5' hydroxyl driven
religation. Site-specific type I topoisomerases include, but are not limited to, viral
topoisomerases such as pox virus topoisomerase. Examples of pox virus topoisomerases
include those of Shope fibroma virus and ORF virus. Other site-specific topoisomerases
are well known to those skilled in the art and can be used to practice this invention.

Vaccinia topoisomerase binds to duplex DNA and cleaves the phosphodiester
backbone of one strand while exhibiting a high level of sequence specificity. Cleavage
occurs at a consensus pentapyrimidine element 5'-(C/T)CCTT-3', or related sequences, in
the scissile strand. In one embodiment the scissile bond is situated in the range of 2 to
12 bp from the 3' end of the duplex DNA. In another embodiment cleavable complex
formation by Vaccinia topoisomerase requires six duplex nucleotides upstream and two
nucleotides downstream of the cleavage site. Examples of Vaccinia topoisomerase
cleavable sequences include, but are not limited to, +6/-6 duplex GCCCTTATTCCC
(SEQ ID NO:1), +8/-4 duplex TCGCCCTTATTC (SEQ ID NO:2), +10/-2 duplex
TGTCGCCCTTAT (SEQ ID NO:3), and +11/-1 duplex GTGTCGCCCTTA (SEQ ID NO:4).

Examples of other site-specific type I topoisomerases are well known in the art.
These enzymes are encoded by many organisms including, but not limited to,
Saccharomyces cerevisiae, Schizosaccharomyces pombe and Tetrahymena; however, the
topoisomerase I enzymes of these species have less specificity for a consensus sequence
than does the Vaccinia topoisomerase (Lynn et al., Proc. Natl. Acad. Sci. USA
86:3559-3563, 1989; Eng et al., J. Biol. Chem. 264:13373-13376, 1989; Busk et al., Nature 327:
638-640, 1987). 
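
The consensus element described above for the Vaccinia topoisomerase can be located
in a scissile strand with a few lines of code. The following is a minimal sketch only,
assuming the strand is supplied 5' to 3' as a plain string and that cleavage occurs
immediately 3' of the final T of the element; the function name find_cleavage_sites is
hypothetical and is not part of the disclosure. For each site it reports how many
nucleotides lie upstream (through the final T) and downstream of the scissile bond.

```python
import re

# Consensus pentapyrimidine element (C/T)CCTT. A lookahead is used so that
# overlapping candidate sites are not skipped.
CONSENSUS = re.compile(r"(?=([CT]CCTT))")


def find_cleavage_sites(scissile_strand):
    """Return a list of (upstream, downstream) nucleotide counts.

    upstream   -- nucleotides from the 5' end through the final T of the element
    downstream -- nucleotides 3' of the assumed scissile bond
    """
    seq = scissile_strand.upper()
    sites = []
    for match in CONSENSUS.finditer(seq):
        cut = match.start(1) + 5  # index just past the final T of the element
        sites.append((cut, len(seq) - cut))
    return sites


if __name__ == "__main__":
    # SEQ ID NO:1 and SEQ ID NO:2 as listed in the text above.
    for name, seq in [("SEQ ID NO:1", "GCCCTTATTCCC"),
                      ("SEQ ID NO:2", "TCGCCCTTATTC")]:
        print(name, find_cleavage_sites(seq))
```

A sequence lacking the element simply yields an empty list, so the same function can
serve as a quick screen of candidate oligonucleotides.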

The compositions and methods of the invention are exemplified generally herein
with reference to the use of type IB topoisomerase such as the Vaccinia topoisomerase.
However, it will be recognized that the methods also can be performed using other
topoisomerases merely by adjusting the components accordingly. For example, as
described in greater detail below, methods are disclosed for incorporating a type IB
topoisomerase recognition site at one or both 3' termini of a ds nucleic acid molecule.
Accordingly, in view of the present disclosure, the artisan will recognize that a
topoisomerase recognition site for a type IA or type II topoisomerase similarly can be
incorporated into a ds nucleic acid molecule.

A topoisomerase-charged ds nucleic acid molecule that contains a 5' overhang on 
a first end generally contains a topoisomerase covalently bound to the 3' terminus of the 
first end. A ds nucleic acid containing a 5' overhang and first topoisomerase at a first end
also can contain a second topoisomerase covalently bound to the second end. The
topoisomerase covalently bound to the first end can be the same as or different from the
topoisomerase covalently bound to the second end. Thus, a Vaccinia topoisomerase can
be covalently bound to a first end and another poxvirus or eukaryotic nuclear type IB
topoisomerase can be bound to the second end. Generally, where the topoisomerases at
each end are different, they are members of the same general family, for example, type IA
or type IB or type II topoisomerase.

In one embodiment, a topoisomerase-charged double stranded nucleic acid 
molecule of the invention is a vector, which can be a cloning vector or an expression 
vector. The vector can include elements such as a bacterial origin of replication, a
eukaryotic origin of replication, antibiotic resistance genes, and the like, and can further
include topoisomerase recognition sites or topoisomerase-charged ends or a combination
thereof. Such vectors of the invention can conveniently be packaged into kits as
disclosed herein. A vector of the invention can be a plasmid vector, a cosmid vector, an
artificial chromosome (e.g., a bacterial artificial chromosome, a yeast artificial
chromosome, a mammalian artificial chromosome, etc.) or a viral vector such as a
bacteriophage, baculovirus, retrovirus, lentivirus, adenovirus, Vaccinia virus, Semliki
Forest virus and adeno-associated virus vector, all of which are well known and can be
purchased from commercial sources (Promega, Madison WI; Stratagene, La Jolla CA;
GIBCO/BRL, Gaithersburg MD). Viral expression vectors can be particularly useful
where a method of the invention is practiced for the purpose of generating a directionally
linked recombinant nucleic acid molecule that is to be introduced into a cell, particularly
a cell in a subject. Viral vectors provide the advantage that they can infect host cells with
relatively high efficiency and can infect specific cell types or can be modified to infect
particular cells in a host. 

Viral vectors have been developed for use in particular host systems and include,
for example, baculovirus vectors, which infect insect cells; retroviral vectors, other
lentivirus vectors such as those based on the human immunodeficiency virus (HIV),
adenovirus vectors, adeno-associated virus (AAV) vectors, herpesvirus vectors, Vaccinia
virus vectors, and the like, which infect mammalian cells (see Miller and Rosman,
BioTechniques 7:980-990, 1992; Anderson et al., Nature 392:25-30 Suppl., 1998; Verma
and Somia, Nature 389:239-242, 1997; Wilson, New Engl. J. Med. 334:1185-1187, 1996,
each of which is incorporated herein by reference). For example, a viral vector based on
an HIV can be used to infect T cells, a viral vector based on an adenovirus can be used,
for example, to infect respiratory epithelial cells, and a viral vector based on a herpesvirus
can be used to infect neuronal cells. Other vectors, such as AAV vectors, can have greater
host cell range and, therefore, can be used to infect various cell types, although viral or
non-viral vectors also can be modified with specific receptors or ligands to alter target
specificity through receptor mediated events.

A linearized vector of the invention, which is topoisomerase-charged or contains
topoisomerase recognition sites, can be generated using methods as disclosed herein or
otherwise known in the art. For example, a circular vector can be linearized, and
modified by ligating or hybridizing one or more oligonucleotides, to generate a
topoisomerase recognition site, or a cleavage product thereof, and a target 5' sequence or
5' overhang, at one or both ends (see Examples 1 and 2). The vector also can contain, for
example, expression control elements required for replication in a prokaryotic host cell, a
eukaryotic host cell, or both, and can contain a nucleotide sequence encoding a
polypeptide that confers antibiotic resistance or the like, or such elements can be
introduced into the vector using the methods of the invention. Furthermore, the vector
can contain one, two, or more site-specific integration recognition sites such as an att site
or lox site. The incorporation, for example, of attB or attP sequences into an isolated
nucleic acid molecule of the present invention allows for the convenient manipulation of
the nucleic acid molecule using the GATEWAY™ Cloning System (Invitrogen Corp., La
Jolla CA).

The invention provides a modified cloning vector having an overhanging single 
stranded piece of DNA charged with topoisomerase. The modified vector allows the
directional insertion of linear ds nucleic acid molecules, for example PCR amplified, or
otherwise suitable ORFs, for subsequent expression, and takes advantage of
topoisomerase cloning efficiency. As used herein, the term donor signifies molecules
such as a duplex DNA which contains a 5'-CCCTT cleavage site near the 3' end, and the
term acceptor signifies a duplex DNA which contains a 5'-OH terminus. Once covalently
activated by topoisomerase, the donor will be transferred to those acceptors to which it has
single strand sequence complementation. 

According to the present invention, in particular embodiments topoisomerase-modified
vectors are further adapted to contain at least one 5' single-stranded overhang
sequence to facilitate the directional insertion of DNA segments. A nucleic acid molecule
to be cloned into such a vector can be a PCR product constituting an ORF, which can be
expressed from the resultant recombinant vector. The primers used for amplifying the
ORF are designed such that at least one primer of the primer pair contains an additional
sequence at its 5' end. This sequence is designed to be complementary to the sequence of
the 5' single-stranded overhang present in the topoisomerase-modified vector of the
present invention. 
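
The primer design rule stated above can be illustrated as follows. This is a sketch
under stated assumptions rather than a protocol: sequences are written 5' to 3', the
four-nucleotide overhang and the gene-specific primer shown are hypothetical
placeholders, and add_overhang_tail is an illustrative helper name. The tail added to
the 5' end of the primer is the reverse complement of the vector overhang, so that the
overhang can base pair with it in antiparallel orientation.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def add_overhang_tail(primer, vector_overhang):
    """Prepend to the primer a 5' tail complementary to the vector 5' overhang."""
    return reverse_complement(vector_overhang) + primer.upper()


if __name__ == "__main__":
    vector_overhang = "GGTG"            # hypothetical overhang, 5'->3'
    forward_primer = "ATGGCTAGCAAAGGA"  # hypothetical gene-specific primer
    print(add_overhang_tail(forward_primer, vector_overhang))
    # prints CACCATGGCTAGCAAAGGA
```

Only one primer of the pair receives the tail when directional insertion is desired; the
other primer is left unmodified in this sketch.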

The present invention generally provides methods for generating a directionally or 
non-directionally linked recombinant nucleic acid molecule, by using a strand invasion 
event and a ligation event to link, in a directional or non-directional manner, a first
nucleic acid molecule and at least a second nucleic acid molecule. As used herein, the
term "strand invasion" refers to the displacement of one strand of a first double stranded
nucleic acid molecule by a single stranded portion of a second nucleic acid molecule,
wherein the single strand has a nucleotide sequence that is substantially identical to the
displaced strand and can selectively hybridize to the strand complementary to the
displaced strand. 
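
The definition of strand invasion given above can be expressed as a simple sequence
test. The sketch below is illustrative only and makes several assumptions: sequences are
written 5' to 3', the duplex is described by the strand whose 5' terminus lies at the
blunt end, and "substantially identical" is approximated by a caller-chosen number of
tolerated mismatches (max_mismatches, a hypothetical parameter). The overhang is
compared with the terminal segment of the strand it would displace.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_invade(overhang, blunt_end_strand, max_mismatches=0):
    """Check whether an overhang matches the strand it would displace.

    overhang         -- single-stranded overhang, 5'->3'
    blunt_end_strand -- duplex strand whose 5' terminus is at the blunt end
    """
    overhang = overhang.upper()
    n = len(overhang)
    if n == 0 or len(blunt_end_strand) < n:
        return False
    # The overhang would hybridize to the first n nucleotides of this strand,
    # so the displaced segment is the reverse complement of those nucleotides.
    displaced = reverse_complement(blunt_end_strand[:n])
    mismatches = sum(1 for a, b in zip(overhang, displaced) if a != b)
    return mismatches <= max_mismatches


if __name__ == "__main__":
    print(can_invade("GGTG", "CACCATGG"))   # True
    print(can_invade("GGTG", "ATGGCACC"))   # False
```

Setting max_mismatches above zero relaxes the test; what degree of mismatch still permits
selective hybridization in practice is not addressed by this sketch.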

A method for generating a directionally or non-directionally linked recombinant
nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid
molecule having a first overhang on a first strand (e.g., a 3' or 5' strand) at a first end; and
a second ds nucleic acid molecule having a first substantially blunt end and a second end,
wherein a nucleotide sequence that is complementary to the first overhang of the first end
of the first nucleic acid molecule is present at or near the first substantially blunt end.
The method is performed under conditions such that the first overhang can selectively
hybridize to the complementary nucleotide sequence of the first substantially blunt end of
the second ds nucleic acid molecule, and the first end of the first ds nucleic acid molecule
and the first end of the second ds nucleic acid molecule can be linked. The first overhang
can be a 3' overhang or a 5' overhang. The invention further provides precursor nucleic
acid molecules which can be used to prepare molecules suitable for use in the method
described above. The invention also provides nucleic acid molecules prepared by the
above method.

Figure 1 illustrates examples of ways in which the methods of the invention can
be used to generate a covalently linked recombinant nucleic acid molecule. The boxes
and circles in Figure 1 are used to depict regions of sequence complementarity such that a
strand displacement event can occur. The other ends of the ds nucleic acid molecules,
which do not necessarily (but can) involve a strand displacement event, can have any
structure; for example, they can be substantially blunt or can include a 3' or 5' overhang.
Other combinations of blunt ends and/or overhangs on the first, second, third, etc. ds
nucleic acid molecules can be linked according to the methods of the invention and will be
evident to those in the art based, in part, on the examples provided in Figure 1.

As shown in Figure 1A, a method for generating a directionally or
non-directionally linked recombinant nucleic acid molecule can be performed, for
example, by contacting a first ds nucleic acid molecule, which has a first overhang on a
first strand at a first end; a second ds nucleic acid molecule, which has a first substantially
blunt end and a second end, wherein the first substantially blunt end has a nucleotide
sequence that is complementary to the first overhang of the first end of the first nucleic
acid molecule; and a reagent for ligating the nucleic acid molecules (e.g., a reagent
comprising a topoisomerase, a ligase, or a recombinase). The method is performed under
conditions such that the first overhang can selectively hybridize to the complementary
nucleotide sequence of the first substantially blunt end of the second ds nucleic acid
molecule. Furthermore, the method is performed in the presence of a reagent that can
ligate a 5' strand of one nucleic acid molecule to a 3' strand of a second nucleic acid
molecule, and under conditions such that the 3' terminus of the first end of the first ds
nucleic acid molecule and the 5' terminus of the first end of the second ds nucleic acid
molecule are linked. In various modifications of the method described above, as well as
in other methods described above, the first overhang can be a 3' overhang or a
5' overhang.

As shown in Figure 1B, a method for generating a linked, for example
directionally linked, recombinant nucleic acid molecule can be performed by contacting a
first ds nucleic acid molecule with a first overhang on a first strand at a first end and a
second substantially blunt end; and a second ds nucleic acid molecule, which has a first
substantially blunt end and a second end which has a second overhang, wherein the first
substantially blunt end of the second ds nucleic acid molecule has a nucleotide sequence
that is complementary to the first overhang of the first end of the first nucleic acid
molecule, and wherein the second substantially blunt end of the first nucleic acid
molecule has a nucleotide sequence that is complementary to the second overhang of the
second end of the second ds nucleic acid molecule. The method is performed under
conditions such that the first overhang can selectively hybridize to the complementary
nucleotide sequence of the first substantially blunt end of the second ds nucleic acid
molecule, and wherein the second overhang can selectively hybridize to the
complementary nucleotide sequence of the second substantially blunt end of the first ds
nucleic acid molecule. Furthermore, the method may be performed in the presence of a
reagent which can catalyze the ligation of a 3' strand of one nucleic acid molecule to a
5' strand of another nucleic acid molecule, and under conditions such that the 3' terminus
of the first end of the first ds nucleic acid molecule is linked to the 5' terminus of the first
end of the second ds nucleic acid molecule, and the 3' terminus of the second substantially
blunt end of the first ds nucleic acid molecule is linked to the 5' terminus of the second
end of the second ds nucleic acid molecule. In various modifications of the method
described above, one or both of the first and second overhangs can be 3' overhangs or
5' overhangs. The ds nucleic acid molecules can thus engage in two separate strand
invasion events which, upon covalent linkage of nucleic acid strands at each terminus,
result in the formation of a circular recombinant nucleic acid molecule.

As shown in Figure 1C, a method for generating a linked, for example
directionally linked, recombinant nucleic acid molecule can be performed, for example,
by contacting a first ds nucleic acid molecule with a first overhang on a first strand at a
first end and a second end having a second overhang; and a second ds nucleic acid
molecule, which has a first substantially blunt end and a second substantially blunt end,
wherein the first substantially blunt end of the second ds nucleic acid molecule has a
nucleotide sequence that is complementary to the first overhang of the first end of the first
nucleic acid molecule, and wherein the second substantially blunt end of the second
nucleic acid molecule has a nucleotide sequence that is complementary to the second
overhang of the second end of the first ds nucleic acid molecule. The method may be
performed under conditions such that the first overhang can selectively hybridize to the
complementary nucleotide sequence of the first substantially blunt end of the second ds
nucleic acid molecule, and wherein the second overhang can selectively hybridize to the
complementary nucleotide sequence of the second substantially blunt end of the second
ds nucleic acid molecule. Furthermore, the method is performed under conditions such
that the first end of the first ds nucleic acid molecule is linked to the first end of the
second ds nucleic acid molecule, and the second end of the first ds nucleic acid molecule
is linked to the second substantially blunt end of the second ds nucleic acid molecule. In
various modifications of the method described above, one or both of the first and second
overhangs can be 3' overhangs or 5' overhangs. The ds nucleic acid molecules can thus
engage in two separate strand invasion events, which, upon covalent linkage of nucleic
acid strands at each terminus, result in the formation of a circular recombinant nucleic acid
molecule.

As shown in Figure 1D, a method for generating a linked, for example
directionally linked, recombinant nucleic acid molecule can be performed, for example,
by contacting a first ds nucleic acid molecule, which has a first overhang on a first strand
at a first end; a second ds nucleic acid molecule, which has a first substantially blunt end
and a second substantially blunt end; and a third ds nucleic acid molecule which has a
second overhang on a first strand at a first end, wherein the first substantially blunt end of
the second ds nucleic acid molecule has a nucleotide sequence that is complementary to
the first overhang, and the second substantially blunt end of the second ds nucleic acid
molecule has a nucleotide sequence that is complementary to the second overhang. The
method is performed under conditions such that the first overhang can selectively
hybridize to the complementary nucleotide sequence of the first substantially blunt end of
the second ds nucleic acid molecule, and wherein the second overhang can selectively
hybridize to the complementary nucleotide sequence of the second substantially blunt end
of the second ds nucleic acid molecule. Furthermore, the method may be performed
under conditions such that the first end of the first ds nucleic acid molecule is linked to
the first end of the second ds nucleic acid molecule, and the first end of the third ds
nucleic acid molecule is linked to the second substantially blunt end of the second ds
nucleic acid molecule. In various modifications of the method described above, one or
both of the first and second overhangs can be 3' overhangs or 5' overhangs.

As shown in Figure 1E, a method for generating a linked, for example
directionally linked, recombinant nucleic acid molecule can be performed, for example,
by contacting a first ds nucleic acid molecule, which has a first substantially blunt end; a
second ds nucleic acid molecule which has a first overhang on a first strand at a first end
and a second overhang on a second strand at a second end; and a third ds nucleic acid
molecule, which has a second substantially blunt end, wherein the first substantially blunt
end of the first ds nucleic acid molecule has a nucleotide sequence that is complementary
to the first overhang of the first end of the second nucleic acid molecule, and wherein the
second substantially blunt end has a nucleotide sequence that is complementary to the
second overhang of the second end of the second ds nucleic acid molecule. The method
is performed under conditions such that the first overhang can selectively hybridize to the
complementary nucleotide sequence of the first substantially blunt end of the first ds
nucleic acid molecule, and wherein the second overhang can selectively hybridize to the
complementary nucleotide sequence of the second substantially blunt end. Furthermore,
the method may be performed under conditions such that the first substantially blunt end
of the first ds nucleic acid molecule is linked to the first end of the second ds nucleic acid
molecule, and the second substantially blunt end, located on the third ds nucleic acid
molecule, is linked to the second end of the second ds nucleic acid molecule. In various
modifications of the method described above, one or both of the first and second overhangs
can be 3' overhangs or 5' overhangs.

A method for generating a directionally or non-directionally linked recombinant
nucleic acid molecule can be performed, for example, by contacting a first ds nucleic acid
molecule, which has a first topoisomerase covalently bound at or near a first substantially
blunt end; and a second ds nucleic acid molecule, which has a first 3' overhang on a first
strand at a first end, wherein the first substantially blunt end of the first ds nucleic acid
molecule has a nucleotide sequence that is complementary to the first 3' overhang (see
Figure 2). The method is performed under conditions such that the first 3' overhang can
selectively hybridize to the complementary nucleotide sequence of the first substantially
blunt end of the first ds nucleic acid molecule. Furthermore, the method is performed
such that topoisomerase can covalently link the 3' terminus of the first end of the first ds
nucleic acid molecule to the 5' terminus of the first end of the second ds nucleic acid
molecule (Cheng and Shuman, Mol. Cell. Biol. 20:8059-8068, 2000, which is
incorporated herein by reference in its entirety).

The ability of a topoisomerase covalently bound at or near a first substantially
blunt end of a first ds nucleic acid molecule to be covalently linked to a second ds nucleic
acid molecule with a 3' overhang (Figure 2) illuminates a previously unappreciated
conformational flexibility of covalently bound topoisomerase with respect to its DNA
contacts on the 5' side of the scissile phosphodiester. In catalyzing the relaxation of
supercoiled DNA, covalently bound topoisomerase type IB releases the downstream
duplex and permits rotation of the duplex around the phosphodiester bond opposite the
scissile phosphate before resealing the backbone.

A method of the present invention can be performed in a manner that obviates the
need to perform an additional reaction to repair a remaining nick after the homology
dependent ligation, by substantially replacing one strand of a nucleic acid molecule (see
Figure 2). For example, the methods can be performed such that the overhang sequence
of one nucleic acid molecule extends the entire length of and, upon strand invasion,
replaces a strand of the other ds nucleic acid molecule. Thus, in this embodiment, there is
no nick in the strand that was not ligated by the topoisomerase.

The termini of the ds nucleic acid molecules that are linked using the methods of
the current invention can be covalently linked using any reagent useful for linking a
5' terminus of one nucleic acid molecule to a 3' terminus of a second nucleic acid
molecule. Thus, the reagent for covalently linking the termini can be, for example, a
topoisomerase, including a type IA, type IB or type II topoisomerase; a ligase such as
T4 DNA ligase; a recombinase, including FLP recombinase, Int integrase, or Cre
recombinase; or another Int family member (see, for example, Nucl. Acids Res.
26:391-406, 1998). Furthermore, where one nick remains after a ligation of one strand, the
nick can be closed by an in vivo ligation, for example by introduction into a cell, such as
E. coli, of the nicked ds nucleic acid molecule. In certain preferred embodiments, the
linking of the two ends involved in a strand displacement event involves topoisomerase
ligation. Furthermore, in a method as disclosed for linking a first ds nucleic acid
molecule and a second ds nucleic acid molecule, a third nucleic acid molecule also can be
linked to an end of the first or second nucleic acid molecule that is not involved in a
strand displacement event.

Methods of the present invention can be performed to link a first ds nucleic acid
molecule to at least a second ds nucleic acid molecule in a non-directional or, preferably,
a directional manner. However, the methods can be used to non-directionally link the first
ds nucleic acid molecule and the second ds nucleic acid molecule in embodiments where
complementarity exists between nucleotide sequences at or near a terminus of both ends
of one of the ds nucleic acid molecules and nucleotide sequences at or near a terminus of
at least one end of the other ds nucleic acid molecule. Such complementarity between
nucleotide sequences at or near a terminus of both ends of one ds nucleic acid molecule
and at least one strand of the second ds nucleic acid molecule can be achieved, for
example, by including identical nucleotide sequences at the same terminus (i.e., 5' or 3')
of both ends of a ds nucleic acid molecule. This can be accomplished, for example, by
designing target sequences that can be cleaved with the same restriction enzyme and

The present invention also relates to methods of directionally or non-directionally
linking a first and at least a second nucleic acid molecule, including, as desired,
operatively linking two or more (e.g., 2, 3, 4, 5, 6, 7, etc.) of the nucleic acid molecules.
A method for generating a directionally or non-directionally linked recombinant nucleic
acid molecule can be performed, for example, by contacting a topoisomerase-charged
first ds nucleic acid molecule, which has a first topoisomerase covalently bound at a first
end, and a second topoisomerase covalently bound at a second end, and also contains a
5' overhang at the first end and a blunt end, a 3' uridine overhang, a 3' thymidine
overhang, or a second 5' overhang at the second end; and a second ds nucleic acid
molecule, which has a first blunt end and a second end, wherein the first blunt end has a
5' nucleotide sequence that is complementary to the first 5' overhang of the first end of the
first nucleic acid molecule. The first and second topoisomerases can be the same, for
example, two type IB topoisomerases, including two Vaccinia type IB topoisomerases, or
can be different, including two type IB topoisomerases from different organisms or a
type IB topoisomerase and a type IA or a type II topoisomerase.

In performing a method of the invention, the first and second ds nucleic acid
molecules are contacted under conditions such that the 5' nucleotide sequence of the
second nucleic acid molecule can selectively hybridize to the first 5' overhang, whereby
the first topoisomerase can covalently link the 3' terminus of the first end of the first
ds nucleic acid molecule to the 5' terminus of the first end of the second ds nucleic acid
molecule, and the second topoisomerase can covalently link the 3' terminus of the second
end of the first ds nucleic acid molecule to the 5' terminus of the second end of the second
ds nucleic acid molecule, to generate a directionally or non-directionally linked
recombinant nucleic acid molecule. Accordingly, the present invention provides a
directionally or non-directionally linked recombinant nucleic acid molecule produced by
such a method.

As disclosed herein, a method of the invention can provide a means to
directionally link two or more ds nucleic acid molecules in a predetermined directional
orientation. The term "directionally link" is used herein to refer to the covalent linkage of
two or more nucleic acid molecules in a particular predetermined order and/or orientation.
Thus, a method of the invention provides a means, for example, to covalently link a promoter
expression control element upstream of a coding sequence, and to covalently link a
polyadenylation signal downstream of the coding region to generate a functional
expressible recombinant nucleic acid molecule; or to covalently link two coding
sequences such that they can be transcribed and translated in frame to produce a fusion
polypeptide. The term "non-directionally link" is used herein to refer to the covalent
linkage of two or more nucleic acid molecules in a random order, i.e., either of the first or
second end of the nucleic acid molecule can be linked to an end of another nucleic acid
molecule.

The term "substantially blunt," when used in reference to an end of a ds nucleic
acid molecule, means that the end can be blunt or can have a short overhang that does not
reduce or inhibit a strand invasion event by a second nucleic acid molecule having an
overhang. For example, a substantially blunt end can include an end having an overhang
of 1, 2, or a few nucleotides, provided the overhang at the substantially blunt end does not
block strand invasion. For example, the second ds nucleic acid molecule can have a
5' adenosine or 5' inosine overhang.

It should be recognized that reference herein to a "first nucleic acid molecule,"
"second nucleic acid molecule," "third nucleic acid molecule," and the like, is used only
to provide a means to indicate which of several nucleic acid molecules is being referred
to. Thus, absent any specifically defined characteristic with respect to a particular nucleic
acid molecule, the terms "first," "second," "third" and the like, when used in reference to
a nucleic acid molecule, or a population or plurality of nucleic acid molecules, are not
intended to indicate any particular order, importance or other information about the
nucleic acid molecule. Thus, where an exemplified method refers, for example, to using
PCR to amplify a first ds nucleic acid molecule such that the amplification product
contains a topoisomerase recognition site at one or both ends, it will be recognized that,
similarly, a second (or other) ds nucleic acid molecule also can be so amplified.

The term "at least a second," when used in reference to a ds nucleic acid molecule,
means one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) nucleic acid molecules in
addition to a first ds nucleic acid molecule. Thus, the term can refer to only a second
nucleic acid molecule, or to a second nucleic acid molecule and a third nucleic acid
molecule (or more). As such, the term "second (or other) ds nucleic acid molecule" or
"second (and other) ds nucleic acid molecules" is used herein in recognition of the fact
that the term "at least a second nucleic acid molecule" can refer to a second, third or more
nucleic acid molecules. It should be recognized that, unless indicated otherwise, a nucleic
acid molecule encompassed within the meaning of the term "at least a second nucleic acid
molecule" can be the same or substantially the same as a first nucleic acid molecule. For
example, a first and second ds nucleic acid molecule can be the same except that only one
of the molecules, for example, the first ds nucleic acid molecule, has a topoisomerase
recognition site, or except for having complementary 5' overhanging sequences, for
example, produced upon cleavage by a topoisomerase, such that the first and second ds
nucleic acid molecules can be directionally linked using a method of the invention. As
such, a method of the invention can be used to produce a concatenate of first and second
ds nucleic acid molecules, which, optionally, can be interspersed, for example, by a third
ds nucleic acid molecule such as an expression control element, and can contain the
directionally linked sequences in a predetermined directional orientation, for example,
each in a 5' to 3' orientation with respect to each other.

It will be recognized that each of the ds nucleic acid molecules, for example, a
sequence referred to as a first ds nucleic acid molecule, generally comprises a population
of such nucleic acid molecules, which are identical or substantially identical to each other.
Thus, it should be clear that the term "different" is used in comparing, for example, a first
(or population of first) ds nucleic acid molecules with a second (and other) ds nucleic acid
molecule. As used herein, the term "different," when used in reference to the ds nucleic
acid molecules of a composition of the invention, means that the ds nucleic acid
molecules share less than 95% sequence identity with each other when optimally aligned,
generally less than 90% sequence identity, and usually less than 70% sequence identity.
Thus, ds nucleic acid molecules that, for example, differ only in being polymorphic
variants of each other or that merely contain different 5' or 3' overhanging sequences are
not considered to be "different" for purposes of a composition of the invention. In
comparison, different ds nucleic acid molecules are exemplified by a first sequence
encoding a polypeptide and a second sequence comprising an expression control element, or
a first sequence encoding a first polypeptide and a second sequence encoding a
non-homologous polypeptide.
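
By way of illustration only, the identity thresholds used to define "different" can be
sketched in Python. The sketch assumes that the two sequences have already been optimally
aligned by an external tool, with "-" marking gaps. The function names, and the choice to
count identity over all informative alignment columns, are assumptions made for this
example and are not part of the disclosure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Both strings must have the same length; '-' marks a gap. Columns in which
    both sequences carry a gap are ignored (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column counts as a match only when both residues are identical bases.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_different(aligned_a, aligned_b, threshold=95.0):
    """True when the aligned pair shares less than `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) < threshold
```

Passing threshold=90.0 or threshold=70.0 applies the narrower definitions given above.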

The term "recombinant" is used herein to refer to a nucleic acid molecule that is
produced by linking at least two nucleic acid molecules. As such, a recombinant nucleic
acid molecule encompassed within or generated according to a method of the invention is
distinguishable from a nucleic acid molecule that may be produced in nature, for
example, during meiosis. A recombinant nucleic acid molecule generated according to a
method of the invention can be identified, for example, by the presence of the
complementary nucleic acid sequence in close proximity, generally directly adjacent, and
usually directly 3', to a topoisomerase binding site in a double stranded nucleic acid
molecule.

As disclosed herein, a method of the invention can be used to directionally or
non-directionally link a first ds nucleic acid molecule to a second ds nucleic acid
molecule. In many embodiments, the method may be used to directionally link a first ds
nucleic acid molecule and a second (or other) ds nucleic acid molecule. However, use of
the method to non-directionally link a first ds nucleic acid molecule and a second (or
other) ds nucleic acid molecule also provides advantages. Non-directional linking can be
performed, for example, 1) where a second nucleotide sequence is present at or near the
5' terminus of the second end of the first ds nucleic acid molecule, which can form all or
part of a second overhang, and is capable of hybridizing to the 5' complementary
nucleotide sequence of the second ds nucleic acid molecule; and 2) where a nucleotide
sequence is present at or near the 5' terminus of both the first end and the second end of
the second ds nucleic acid molecule that is capable of hybridizing to the 5' overhang of
the first end of the first ds nucleic acid molecule. In these embodiments, the second end
of the first ds nucleic acid molecule and the second end of the second ds nucleic acid
molecule can be either blunt, or include an overhang.

In another embodiment, a method for generating a directionally or
non-directionally linked recombinant nucleic acid molecule can be performed, for
example, by contacting a first precursor ds nucleic acid molecule having a first end,
which has a first 5' target sequence at the 5' terminus and a topoisomerase recognition site
at or near the 3' terminus, and a second end, which has a topoisomerase recognition site at
or near the 3' terminus; a second ds nucleic acid molecule having a first blunt end and a
second end, wherein the first blunt end has a 5' nucleotide sequence complementary to the
5' target sequence of the first precursor ds nucleic acid molecule; and a topoisomerase that
is specific for the topoisomerase recognition site. As used herein, reference to a
"precursor" ds nucleic acid molecule means a ds nucleic acid molecule that contains a
topoisomerase recognition site and that, upon cleavage by a topoisomerase specific for
the recognition site, produces an end having a desired 5' terminal nucleotide sequence,
3' terminal nucleotide sequence, or both. Such a desired end is produced, in part, due to
the presence of the 5' target sequence, which, upon cleavage of the ds nucleic acid
molecule containing the 5' target sequence, is converted to a 5' nucleotide sequence that
allows directionally linking the ds nucleic acid molecule to a second ds nucleic acid
molecule.



According to a method of the invention, the first ds nucleic acid molecule, second
ds nucleic acid molecule and topoisomerase are contacted under conditions that allow
topoisomerase activity, i.e., such that the topoisomerase can bind to and cleave the
recognition site, to produce a topoisomerase-charged 3' terminus, and can ligate the
3' terminus to an appropriate 5' terminus. Such conditions also allow hybridization of the
portion of the first 5' target sequence that remains following cleavage by
the topoisomerase and the 5' nucleotide sequence of the second ds nucleic acid molecule
that is complementary to that portion of the 5' target sequence.

In performing a method of the invention, a precursor ds nucleic acid molecule can
be combined in the same reaction vessel and at the same time with the topoisomerase and
the second ds nucleic acid molecule before the precursor ds nucleic acid molecule is
converted into a topoisomerase-charged ds nucleic acid molecule that can be directionally
linked to a second ds nucleic acid molecule. By combining the topoisomerase in the same
reaction vessel as the precursor ds nucleic acid and the second nucleic acid, the methods
of the present invention are simplified. Alternatively, a first precursor ds nucleic acid
molecule can be combined with topoisomerase under conditions that allow topoisomerase
cleavage and binding, then a second ds nucleic acid molecule can be added.

A precursor ds nucleic acid molecule can be linear or circular, including
supercoiled, and, as a result of cleavage by one or more topoisomerases and, if desired, a
restriction endonuclease, a linear topoisomerase-charged first ds nucleic acid molecule is
produced. For example, a circular ds nucleic acid molecule containing two type IB
topoisomerase recognition sites within about 100 nucleotides of each other and in the
complementary strands, preferably within about twenty nucleotides of each other and in
the complementary strands, can be contacted with a site specific type IB topoisomerase
such that each strand is cleaved and the intervening sequence dissociates, thereby
generating a linear ds nucleic acid molecule having a topoisomerase covalently bound to
each end.

In general, a topoisomerase-charged double stranded nucleic acid, which can be
directionally linked to a second or other ds nucleic acid molecule, is generated by
contacting topoisomerase with a precursor ds nucleic acid that has at least one
topoisomerase recognition site at or near one end and a first target sequence. As used
herein, the term "topoisomerase recognition site" means a defined nucleotide sequence
that is recognized and bound by a site specific topoisomerase. For example, the
nucleotide sequence 5'-(C/T)CCTT-3' is a topoisomerase recognition site that is bound
specifically by most poxvirus topoisomerases, including Vaccinia virus DNA
topoisomerase I, which then can cleave the strand after the 3'-most thymidine of the
recognition site to produce a nucleic acid molecule comprising
5'-(C/T)CCTT-PO4-TOPO, i.e., a complex of the topoisomerase covalently bound to the
3' phosphate through a tyrosine residue in the topoisomerase (see Shuman, J. Biol. Chem.
266:11372-11379, 1991; Sekiguchi and Shuman, Nucl. Acids Res. 22:5360-5365, 1994;
each of which is incorporated herein by reference; see, also, U.S. Pat. No. 5,766,891;
PCT/US95/16099; PCT/US98/12372).
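
For illustration, the recognition and cleavage rule just described can be expressed as a
short Python sketch that scans one strand, written 5' to 3', for 5'-(C/T)CCTT-3' and
reports where cleavage would occur. The function names and the text label appended to the
charged strand are hypothetical conveniences, and the sketch models sequence positions
only, not enzyme kinetics or binding preferences.

```python
import re

# 5'-(C/T)CCTT-3'; the lookahead allows overlapping candidate sites to be reported.
SITE = re.compile(r"(?=([CT]CCTT))")


def cleavage_positions(strand):
    """For a strand written 5'->3', return the 0-based index of the nucleotide
    immediately following the 3'-most thymidine of each recognition site."""
    return [m.start() + 5 for m in SITE.finditer(strand.upper())]


def charged_fragment(strand, position):
    """Text representation of the strand retained after cleavage at `position`,
    with the covalently bound topoisomerase written as a suffix."""
    return strand[:position].upper() + "-PO4-TOPO"
```

For example, cleavage_positions("AAACCCTTGGGG") returns [8], and
charged_fragment("AAACCCTTGGGG", 8) returns "AAACCCTT-PO4-TOPO".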

An advantage of constructing a precursor ds nucleic acid molecule to comprise,
for example, a type IB topoisomerase recognition site about 2 to 15 nucleotides (e.g., 2, 3,
4, 5, 6, 7, 8, 9, 12, 14, 15, or 20 nucleotides) from one or both ends is that a 5' overhang is
generated following cleavage of the ds nucleic acid molecule by a site specific
topoisomerase. Such a 5' overhanging sequence, which would contain 2 to
20 nucleotides, respectively, can be designed using a PCR method as disclosed herein to
have any sequence as desired. Thus, where a cleaved first ds nucleic acid molecule is to
be directionally linked to a selected second (or other) ds nucleic acid molecule according
to a method of the invention, and where the selected sequence has a 5' overhanging
sequence, the 5' overhang on the first ds nucleic acid molecule can be designed to be
complementary to the 5' overhang on the selected second (or other) ds sequence such that
the two (or more) sequences are directionally linked in a predetermined orientation due to
the complementarity of the 5' overhangs. As discussed above, similar methods can be
utilized with respect to 3' overhanging sequences generated upon cleavage by a type IA or
type II topoisomerase.
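
The relationship between the position of a type IB recognition site and the resulting
5' overhang can be sketched as follows. The sketch assumes that the strand bearing the
site is supplied 5' to 3', that the site nearest the 3' terminus is the one cleaved, and
that the short fragment downstream of the cleavage site dissociates, which leaves its
complement on the opposite strand as a 5' overhang. The function names and the example
sequences are hypothetical.

```python
import re

SITE = re.compile(r"(?=([CT]CCTT))")          # 5'-(C/T)CCTT-3'
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def five_prime_overhang(top_strand):
    """Return (retained_strand, overhang) for cleavage at the recognition site
    closest to the 3' terminus of `top_strand` (written 5'->3').

    The overhang is the 5' single-stranded extension left on the opposite
    strand, written 5'->3'. Its length equals the number of nucleotides that
    lay between the cleavage site and the 3' terminus.
    """
    cuts = [m.start() + 5 for m in SITE.finditer(top_strand.upper())]
    if not cuts:
        raise ValueError("no 5'-(C/T)CCTT-3' site found")
    cut = cuts[-1]
    released = top_strand[cut:]               # fragment that dissociates
    return top_strand[:cut].upper(), reverse_complement(released)
```

With a site placed four nucleotides from the 3' terminus, as in
five_prime_overhang("GATTACACCCTTCACC"), the sketch returns a four-nucleotide overhang.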

As used herein, the term "cleavage product," when used in reference to a
topoisomerase recognition site, refers to a nucleic acid molecule that has been cleaved by
a topoisomerase, generally at its recognition site, and comprises a complex of the
topoisomerase covalently bound, in the case of a type IB topoisomerase, to the
3' phosphate group of the 3' terminal nucleotide in the topoisomerase recognition site, or,
in the case of type IA or type II topoisomerase, to the 5' phosphate group of the
5' terminal nucleotide in the topoisomerase recognition site. Such a complex, which
comprises a topoisomerase cleaved ds nucleic acid molecule having the topoisomerase
covalently bound thereto, is referred to herein as a "topoisomerase-activated" or a
"topoisomerase-charged" nucleic acid molecule. Topoisomerase-activated ds nucleic acid
molecules can be used in a method of the invention, as can ds nucleic acid molecules that
contain an uncleaved topoisomerase recognition site and a topoisomerase, wherein the
topoisomerase can cleave the ds nucleic acid molecule at the recognition site and become
covalently bound thereto.

As will be readily apparent from the present disclosure, the ends of ds nucleic acid
molecules to be linked according to a method of the invention can have various
characteristics. For example, in one aspect, the second end of a first precursor ds nucleic
acid is a blunt end upon cleavage by the topoisomerase, and the second end of a second ds
nucleic acid molecule is a blunt end. In another aspect, the second end of a first precursor
ds nucleic acid molecule has a 3' thymidine extension upon cleavage by the
topoisomerase, and the second end of a second ds nucleic acid molecule has a
3' adenosine overhang, or the termini can comprise 3' adenosine and 3' deoxyuridine
overhangs (see U.S. Pat. Nos. 5,487,993 and 5,856,144, each of which is incorporated
herein by reference). In yet another aspect, a first precursor ds nucleic acid molecule has
a second 5' target sequence at the second end, and the second end of a second ds nucleic
acid molecule has a 5' nucleotide sequence complementary to at least a portion of the
second 5' target sequence. The first precursor ds nucleic acid molecule can be a vector,
including a cloning vector and an expression vector, and, where generally present in a
circular form, can be linearized due to the action of the topoisomerase, or can be
linearized by including, for example, one or two restriction endonucleases that linearize
the vector such that, upon contact with the topoisomerase, the first and second ds nucleic
acid molecules can be directionally linked according to a method of the invention.

As used herein, the term "at or near," when used in reference to the proximity of a
topoisomerase recognition site to the 3' (type IB) or 5' (type IA or type II) terminus of a
nucleic acid molecule, means that the site is within about 1 to 100 nucleotides from the
3' terminus or 5' terminus, respectively, generally within about 1 to 20 nucleotides from
the terminus, and particularly within about 2 to 12 nucleotides from the respective
terminus. An advantage of positioning the topoisomerase recognition site within about 10
to 15 nucleotides of a terminus is that, upon cleavage by the topoisomerase, the portion of
the sequence downstream of the cleavage site can spontaneously dissociate from the
remaining nucleic acid molecule, which contains the covalently bound topoisomerase
(referred to generally as "suicide cleavage"; see, for example, Shuman, supra, 1991;
Andersen et al., supra, 1991). Where a topoisomerase recognition site is greater than
about 12 to 15 nucleotides from the terminus, the nucleic acid molecule upstream or
downstream of the cleavage site can be induced to dissociate from the remainder of the
sequence by modifying the reaction conditions, for example, by providing an incubation
step at a temperature above the melting temperature of the portion of the duplex including
the topoisomerase cleavage site.
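
As a rough illustration of the distinction drawn above, the following sketch classifies a
released fragment by length and, for longer fragments, estimates a melting temperature
using the widely used 2(A+T) + 4(G+C) rule of thumb for short oligonucleotides. The
15 nucleotide cutoff is an assumed default taken from the upper end of the "about 12 to
15" range, the rule of thumb is only approximate (particularly for longer fragments), and
the function names are hypothetical.

```python
def wallace_tm(oligo):
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def dissociation_plan(released_fragment, spontaneous_limit=15):
    """Return ("spontaneous", None) when the fragment downstream of the cleavage
    site is short enough to dissociate on its own under this sketch's cutoff,
    otherwise ("heat", estimated_tm) to indicate that incubation above the
    estimated melting temperature may be needed."""
    if len(released_fragment) <= spontaneous_limit:
        return ("spontaneous", None)
    return ("heat", wallace_tm(released_fragment))
```

In practice, suitable conditions would be determined empirically, as noted elsewhere
herein.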

A method of the invention using a first precursor ds nucleic acid molecule having
a 5' target sequence and a topoisomerase on a first end and a second end, can be
performed to directionally or non-directionally link a first precursor ds nucleic acid
molecule to a second ds nucleic acid molecule. The method typically is used to
directionally link the first precursor ds nucleic acid molecule and the second ds nucleic
acid molecule, and also can be used to non-directionally link the first precursor ds nucleic
acid molecule and the second ds nucleic acid molecule. Non-directional linking can be
performed, for example, 1) where a second nucleotide sequence is present at or near the
5' terminus of the second end of the first precursor ds nucleic acid molecule, that is
capable of hybridizing to the 5' complementary nucleotide sequence of the second ds
nucleic acid molecule; and 2) where a nucleotide sequence is present at or near the
5' terminus of both the first end and the second end of the second ds nucleic acid
molecule that is capable of hybridizing to the 5' target sequences at or near the first end of
the first precursor ds nucleic acid molecule. In these embodiments, the second end of the
first precursor ds nucleic acid molecule and the second end of the second ds nucleic acid
molecule can be either blunt, or include an overhang.

A method for generating a directionally or non-directionally linked recombinant
nucleic acid molecule also can be performed by contacting a topoisomerase-charged first
ds nucleic acid molecule, which has, at a first end, a first 5' overhang and a first
topoisomerase covalently bound to the 3' terminus, and a second ds nucleic acid
molecule, which has a first blunt end and a second end, wherein the first blunt end
includes a 5' nucleotide sequence complementary to the first 5' overhang. The method is
performed under conditions such that the 5' nucleotide sequence of the first blunt end can
selectively hybridize to the first 5' overhang, whereby the first topoisomerase can
covalently link the 3' terminus of the first end of the first ds nucleic acid molecule with
the 5' terminus of the first end of the second ds nucleic acid molecule.
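
The complementarity requirement described above can be illustrated with a short sketch
that predicts whether a blunt-ended second molecule could be linked in one orientation,
in the other, in either (non-directional linking), or not at all. The sketch compares
sequences only and does not model hybridization conditions. The function names, the
"forward"/"reverse" labels, and the four-nucleotide overhang GGTG used in the usage note
are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def linking_orientation(overhang, insert_top):
    """Predict the orientation in which a blunt-ended molecule can be linked.

    overhang   -- 5' overhang of the topoisomerase-charged molecule, 5'->3'
    insert_top -- one strand of the blunt-ended second molecule, 5'->3'
    """
    # A 5' terminal sequence can hybridize to the overhang when it equals the
    # reverse complement of the overhang.
    target = reverse_complement(overhang)
    top = insert_top.upper()
    bottom = reverse_complement(top)          # the other strand, 5'->3'
    forward = top.startswith(target)          # 5' terminus of the first end
    reverse = bottom.startswith(target)       # 5' terminus of the second end
    if forward and reverse:
        return "either"
    if forward:
        return "forward"
    if reverse:
        return "reverse"
    return "none"
```

For example, with the overhang GGTG, a second molecule whose strand begins 5'-CACC is
reported as "forward", whereas a molecule carrying CACC at the 5' terminus of both ends is
reported as "either".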

Such a method can further include contacting the topoisomerase-charged first ds
nucleic acid molecule and the second ds nucleic acid molecule with a third ds nucleic acid
molecule, wherein a first end of the third ds nucleic acid molecule has a 5' overhang and a
second topoisomerase covalently bound at the 3' terminus, and wherein the second ds
nucleic acid molecule has a second blunt end, which includes a 5' nucleotide sequence
complementary to the 5' overhang of the third nucleic acid molecule. The contacting can
be performed, for example, under conditions such that the 5' nucleotide sequence of the
second blunt end of the second ds nucleic acid can selectively hybridize to the
5' overhang of the first end of the third ds nucleic acid molecule, whereby the second
topoisomerase can covalently link the 3' terminus of the first end of the third ds nucleic
acid molecule with the 5' terminus of the second blunt end of the second ds nucleic acid
molecule. The first and second topoisomerases can be the same or different and, if
desired, the first or third ds nucleic acid molecules, instead of being
topoisomerase-charged, can contain a topoisomerase recognition site, wherein the method
can further include contacting the reactants with a topoisomerase. A method of the
invention can be performed such that the first ds nucleic acid molecule is directionally
linked to the second ds nucleic acid molecule and, thereafter, the third ds nucleic acid
molecule is directionally or non-directionally linked to the second ds nucleic acid
molecule, or all of the reactants can be included together at the same time.



A method of the invention provides a means to render an open reading frame from a
cDNA or an isolated genomic DNA sequence expressible by operatively linking one or
more expression control elements to the putative coding sequence. Examples of
expression control elements useful in the present invention are disclosed herein and
include transcriptional expression control elements, translational expression control
elements, elements that facilitate the transport or localization of a nucleic acid molecule
or polypeptide in (or out of) a cell, elements that confer a detectable phenotype, and the
like. Transcriptional expression control elements include, for example, promoters such as
those from cytomegalovirus, Moloney leukemia virus, and herpes virus, as well as those
from the genes encoding metallothionein, skeletal actin, phosphoenolpyruvate
carboxylase, phosphoglycerate, dihydrofolate reductase, and thymidine kinase, as well as
promoters from viral long terminal repeats (LTRs) such as Rous sarcoma virus LTR;
enhancers, which can be constitutively active such as an immunoglobulin enhancer, or
inducible such as SV40 enhancer, and the like. For example, a metallothionein promoter
is a constitutively active promoter that also can be induced to a higher level of expression
upon exposure to a metal ion such as copper, nickel or cadmium ion. In comparison, a
tetracycline (tet) inducible promoter is an example of a promoter that is induced upon
exposure to tetracycline, or a tetracycline analog, but otherwise is inactive.

A transcriptional expression control element also can be a tissue specific
expression control element, for example, a muscle cell specific expression control
element, such that expression of an encoded product is restricted to the muscle cells in an
individual, or to muscle cells in a mixed population of cells in culture, for example, an
organ culture. Muscle cell specific expression control elements including, for example,
the muscle creatine kinase promoter (Sternberg et al., Mol. Cell. Biol. 8:2896-2909, 1988,
which is incorporated herein by reference) and the myosin light chain enhancer/promoter
(Donoghue et al., Proc. Natl. Acad. Sci. USA 88:5847-5851, 1991, which is incorporated
herein by reference) are well known in the art. Other tissue specific promoters, as well as
expression control elements only expressed during particular developmental stages of a
cell or organism are well known in the art.

Expression control or other elements useful in generating a construct according to
a method of the invention can be obtained in various ways. In particular, many of the
elements are included in commercially available vectors and can be isolated therefrom
and can be modified to contain a topoisomerase recognition site at one or both ends, for
example, using a PCR method as disclosed herein. In addition, the sequences of or
encoding the elements useful herein generally are well known and disclosed in
publications. In many cases, the elements, for example, transcriptional and translational
expression control elements, as well as cell compartmentalization domains, are relatively
short sequences and, therefore, are amenable to chemical synthesis of the element or a
nucleotide sequence encoding the element. Thus, in one embodiment, an element
comprising a composition of the invention, useful in generating a recombinant nucleic
acid molecule according to a method of the invention, or included within a kit of the
invention, can be chemically synthesized and, if desired, can be synthesized to contain a
topoisomerase recognition site at one or both ends of the element and, further, to contain
an overhanging sequence following cleavage by a site specific topoisomerase.

Where ds nucleic acid molecules are to be directionally linked according to a
method of the invention, the nucleic acid molecules generally are operatively linked such
that the recombinant nucleic acid molecule that is generated has a desired structure,
performs a desired function, encodes a desired expression product, or the like. As used
herein, the term "operatively linked" means that two or more nucleic acid molecules are
positioned with respect to each other such that they act as a unit to effect a function
attributable to one or both sequences or a combination thereof. For example, a nucleic
acid molecule containing an open reading frame can be operatively linked to a promoter
such that the promoter confers its regulatory effect on the open reading frame similarly to
the way in which it would effect expression of an open reading frame that it normally is
associated with in a genome in a cell. Similarly, two or more nucleic acid molecules
comprising open reading frames can be operatively linked in frame such that, upon
transcription and translation, a chimeric fusion polypeptide is produced.
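
The in-frame requirement for producing a chimeric fusion polypeptide can be illustrated
with a minimal sketch that joins two open reading frames and checks that the second
remains in the reading frame of the first. The sketch assumes coding-strand sequences
written 5' to 3' and the three standard stop codons; the function name and the optional
linker parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(orf_a, orf_b, linker=""):
    """Join two ORFs so that orf_b is translated in the frame set by orf_a.

    A terminal stop codon on orf_a is removed so that translation can continue
    into orf_b. Raises ValueError if the junction shifts the reading frame or
    if a stop codon lies in frame upstream of orf_b.
    """
    a = orf_a.upper()
    if a[-3:] in STOP_CODONS:
        a = a[:-3]
    upstream = a + linker.upper()
    if len(upstream) % 3 != 0:
        raise ValueError("junction shifts the reading frame")
    codons = [upstream[i:i + 3] for i in range(0, len(upstream), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("in-frame stop codon upstream of the second ORF")
    return upstream + orf_b.upper()
```

A linker whose length is not a multiple of three, such as a residual overhang of two
nucleotides, is rejected by this check.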

A first ds nucleic acid molecule comprising an open reading frame can be
amplified using any amplification method, for example, by PCR using a primer pair, to
generate an amplified first ds nucleic acid molecule having a 5' nucleotide sequence
complementary to a 5' overhang of a topoisomerase-charged ds nucleic acid molecule of
the present invention. Where both ends of the amplified first ds nucleic acid molecule are
complements of two overhangs on the topoisomerase-charged ds nucleic acid molecule,
the 5' overhangs are different from each other. The amplified first ds nucleic acid
molecule then can be contacted with the topoisomerase-charged ds nucleic acid molecule
comprising a desired expression control element such as a promoter such that the second
nucleic acid molecule is operatively linked to the 5' end of the coding sequence according
to a method of the invention.

Various combinations of components can be used in a method of the invention.
For example, the method can be performed by contacting a topoisomerase-activated first
ds nucleic acid molecule; a second ds nucleic acid molecule having a first end and a
second end, wherein at the first end or second end or both, the second nucleic acid
molecule has a hydroxyl group at the 5' terminus of the same end; and a 5' overhang.
Where the 5' terminus of one or both ends to be linked has a 5' phosphate group, a
phosphatase also can be contacted with the components of the reaction mixture. Upon
contacting, the phosphatase, if necessary, can generate a 5' hydroxyl group at the same
end, and the second ds nucleic acid molecule then can be directionally linked to the
topoisomerase-activated first ds nucleic acid molecule. The skilled artisan will recognize
other combinations of components useful for performing a method of the invention.

As used herein, reference to contacting a first nucleic acid molecule and a second
nucleic acid molecule "under conditions such that all components are in contact" means
that the reaction conditions are appropriate for a topoisomerase-cleaved end of a first ds
nucleic acid molecule to come into sufficient proximity such that a topoisomerase can
effect its enzymatic activity and covalently link the 3' terminus of the first ds nucleic acid
molecule to a 5' hydroxyl group at the terminus of a second nucleic acid molecule.
Examples of such conditions include, for example, the reaction temperature, ionic
strength, and pH. Additionally, such conditions can be determined empirically or using
formulas that predict conditions for specific hybridization of nucleic acid molecules, as is
well known in the art (see, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual (Cold Spring Harbor Laboratory Press 1989); Ausubel et al., Current
Protocols in Molecular Biology, John Wiley and Sons, Baltimore, MD (1987, and
supplements through 1995), each of which is incorporated herein by reference).

As disclosed herein, a PCR method using primers designed to incorporate
complementary nucleotide sequences at one or both ends of an amplified ds nucleic acid
molecule provides an example of a convenient means for producing ds nucleic acid
molecules useful in a method of the invention. At least one of the primers of a primer
pair is designed such that it comprises, in a 5' to 3' orientation, a nucleotide sequence
complementary to a first overhang on the topoisomerase-charged ds nucleic acid molecule
of the present invention. The second primer of the PCR primer pair can be
complementary to a desired sequence of the nucleic acid molecule to be amplified, and
can comprise a second complementary sequence.
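
The primer design just described can be sketched as follows. The first primer carries, at
its 5' end, the sequence complementary to the overhang, followed by a target-specific
portion; the second primer is complementary to the far end of the target and optionally
carries a second complementary sequence. The function name, the default annealing length
of 18 nucleotides, and the example sequences are assumptions for illustration, and the
sketch does not check melting temperature, secondary structure, or specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(target, overhang, anneal_len=18, second_overhang=None):
    """Return (forward_primer, reverse_primer), each written 5'->3'.

    target          -- strand of the sequence to be amplified, 5'->3'
    overhang        -- first 5' overhang on the topoisomerase-charged molecule
    second_overhang -- optional second 5' overhang to be matched at the far end
    """
    target = target.upper()
    if anneal_len > len(target):
        raise ValueError("annealing length exceeds target length")
    # 5' tail complementary to the first overhang, then the target-specific part.
    forward = reverse_complement(overhang) + target[:anneal_len]
    # The reverse primer anneals to the 3' end of the supplied strand.
    reverse = reverse_complement(target[-anneal_len:])
    if second_overhang is not None:
        reverse = reverse_complement(second_overhang) + reverse
    return forward, reverse
```

For example, design_primers("ATGGCTAGCTAA", "GGTG", anneal_len=6) returns
("CACCATGGCT", "TTAGCT"), so that the amplified molecule begins with a 5' sequence
complementary to the hypothetical overhang GGTG.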

A primer can contain or encode any other sequence of interest, including, for
example, a site specific integration recognition site such as an att site, a lox site, or the
like, or, as discussed above, can simply be used to introduce a topoisomerase recognition
site into a ds nucleic acid molecule comprising such a sequence of interest. A
recombinant nucleic acid molecule generated according to a method of the invention and
containing a site specific integration recognition site such as an att site or lox site can be
integrated specifically into a desired locus such as into a vector, a gene locus, or the like,
that contains the required integration site, for example, an att site or lox site, respectively,
and upon contact with the appropriate enzymes required for the site specific event, for
example, lambda Int and IHF proteins or Cre recombinase, respectively. The
incorporation, for example, of attB or attP sequences into a directionally or
non-directionally linked recombinant nucleic acid molecule according to a method of the
invention allows for the convenient manipulation of the nucleic acid molecule using the
GATEWAY™ Cloning System (Invitrogen Corp., La Jolla CA). A first ds nucleic acid
molecule used in a method of the invention can be a linearized vector containing two site
specific integration sites, for example, an "entry vector" (GATEWAY™ Cloning System),
and a method of the invention can be used to insert a second (or other) ds nucleic acid
molecule between the site specific integration sites.



A ds nucleic acid molecule to be used in a method or kit of the invention can be
amplified using any amplification reaction, for example, using the polymerase chain
reaction (PCR), to contain a complementary nucleotide sequence at a 5' end. Although
exemplified by PCR, other amplification methods also can be used to amplify a nucleic
acid molecule such that the amplified nucleic acid molecule has a complementary
sequence at the 5' terminus of one of its ends. The complementary nucleotide sequence is
complementary to the 5' overhang on the topoisomerase-charged ds nucleic acid to which
the amplified nucleic acid will be ligated. This complementarity facilitates the
association of the nucleic acid molecules in a predetermined orientation, whereupon they
can be linked by topoisomerase according to a method of the invention.

Amplification primers can be designed to impart particular characteristics to a
desired ds nucleic acid molecule, for example, a ds nucleic acid molecule that encodes a
transcriptional or translational expression control element or a coding sequence of interest
such as an epitope tag or cell compartmentalization domain. In this aspect, the
amplification primers are designed such that, upon amplification, the ds nucleic acid
molecule contains a 5' complementary sequence at one or both ends, as desired.

Amplification primers also can be used to amplify a directionally linked
recombinant nucleic acid molecule generated according to a method of the invention. For
example, a method of the invention can generate from three ds nucleic acid molecules,
including a nucleic acid molecule comprising a promoter, a nucleic acid molecule
comprising a coding sequence, and a nucleic acid molecule comprising a polyadenylation
signal, an expressible recombinant nucleic acid molecule. The generation of the nucleic
acid molecule is facilitated by the incorporation of complementary 5' (or 3') sequences at
the ends of the ds nucleotide sequences to be joined, wherein preferably one of the
complementary sequences is an overhang sequence.

As such, by designing a PCR primer pair containing a first primer that is specific
for an overhang of the nucleic acid molecule comprising the promoter that is upstream
from the promoter, and a second primer that is specific for an overhang of the nucleic acid
molecule comprising the polyadenylation signal that is downstream of the signal, only a
full length functional recombinant nucleic acid molecule containing the promoter, coding
sequence and polyadenylation signal in the correct (predetermined) orientation will be
amplified. In particular, partial reaction products, for example, containing only a
promoter linked to the coding sequence, and reaction products containing nicks are not
amplified. Thus, PCR can be used to specifically design a ds nucleic acid molecule such
that it is useful in a method of the invention, and to selectively amplify only those
reaction products having the desired components and characteristics.

A method of the invention can be performed such that a second ds nucleic acid
molecule, to be directionally ligated to a first ds nucleic acid molecule, is one of a
plurality of second ds nucleic acid molecules. As used herein, the term "plurality," when
used in reference to a first or at least a second nucleic acid molecule, means that the nucleic
acid molecules are related but different. For purposes of the present invention, the
nucleic acid molecules of a plurality are "related" in that each nucleic acid molecule in the
plurality contains, for example, a 5' nucleotide sequence that is complementary to a
5' overhang sequence present on a topoisomerase-charged ds nucleic acid molecule to
which the second ds nucleic acid molecules are to be directionally linked. Furthermore,
the nucleic acid molecules of a plurality are "different" in that they can comprise, for
example, a cDNA library, a combinatorial library of nucleic acid molecules, a variegated
population of nucleic acid molecules, or the like. Methods of making cDNA libraries,
combinatorial libraries, libraries comprising variegated populations of nucleic acid
molecules, and the like are well known in the art (see, for example, U.S. Pat. No.
5,837,500; U.S. Pat. No. 5,622,699; U.S. Pat. No. 5,206,347; Scott and Smith, Science
249:386-390, 1992; Markland et al., Gene 109:13-19, 1991; O'Connell et al., Proc. Natl.
Acad. Sci., USA 93:5883-5887, 1996; Tuerk and Gold, Science 249:505-510, 1990; Gold
et al., Ann. Rev. Biochem. 64:763-797, 1995; each of which is incorporated herein by
reference).

Where a second ds nucleic acid molecule is one of a population of ds nucleic acid
molecules, a method of the invention can further utilize a population of first ds nucleic
acid molecules, each of which contains a different and randomly generated nucleotide
sequence at or near a 3' and/or 5' terminus of a first and/or second end, for example,
randomly generated 3' or 5' overhangs at or near a first end. Such a population of
randomly generated nucleotide sequences near an end will include complementary
sequences to nucleotide sequences at or near the ends of many or all of the second ds
nucleic acid molecules of the plurality. Thus, the method can be used to generate linked
recombinant nucleic acid molecules, including many or all of the nucleic acid molecules
in the plurality of second ds nucleic acid molecules.
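
By way of non-limiting illustration, the coverage provided by a population of
randomly generated overhangs can be sketched in Python as follows. The function
names, the four-nucleotide overhang length, and the example end sequences are
assumptions made for this sketch and are not taken from the disclosure; pairing
is modeled simply as exact reverse complementarity between an overhang and an
end sequence.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def all_overhangs(length):
    """Enumerate every possible overhang of the given length (4**length sequences)."""
    return {"".join(bases) for bases in product("ACGT", repeat=length)}


def matched_fraction(overhang_pool, second_molecule_ends):
    """Fraction of second-molecule ends that find a complementary overhang in the pool."""
    matched = [
        end for end in second_molecule_ends
        if reverse_complement(end) in overhang_pool
    ]
    return len(matched) / len(second_molecule_ends)


# Hypothetical end sequences of two members of a plurality of second molecules.
example_ends = ["CACC", "GATT"]
full_pool = all_overhangs(4)
print(len(full_pool), matched_fraction(full_pool, example_ends))
```

In this model a complete pool of four-nucleotide overhangs pairs with every
end sequence of that length, whereas a partial pool pairs with only some of
them, which corresponds to the "many or all" language above.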

The methods of the invention have broad application to the field of molecular 
biology. As discussed in greater detail below, the methods of the invention can be used, 
for example, to label DNA or RNA probes, to generate sense or antisense RNA, to
prepare bait or prey constructs for performing a two hybrid assay, to prepare linear 
expression elements, to prepare constructs useful for coupled in vitro 
transcription/translation assays, and to perform directional cloning. 

A directionally or non-directionally linked recombinant nucleic acid molecule 
generated according to this aspect of the invention can be linear, but preferably is circular,
particularly a vector, as described above. The circular recombinant nucleic acid molecule 
can be generated such that it has the characteristics of a vector, and contains, for example, 
expression control elements required for replication in a prokaryotic host cell, a 
eukaryotic host cell, or both, and can contain a nucleotide sequence encoding a 
polypeptide that confers antibiotic resistance or the like.

A method of the invention can further include introducing a directionally or
non-directionally linked recombinant nucleic acid molecule into a cell, which can be a
prokaryotic cell such as a bacterium or a eukaryotic cell such as a mammalian cell.
Accordingly, the present invention also provides a cell produced by a method of the
invention, as well as a non-human transgenic organism produced from such a cell. An
advantage of such a method is that the generated recombinant nucleic acid molecule,
which is circularized according to a method of the invention, can be transformed or
transfected into an appropriate host cell, wherein the construct is amplified. Thus, an in
vivo method using a host cell can be used for obtaining a large amount of a circularized
product generated according to a method of the invention.

It should be recognized that a method of the invention is characterized, in part, in
that a linked recombinant nucleic acid molecule generated thereby either contains a nick
in one strand, because a topoisomerase is attached to only one 3' terminus of the ends to be
linked, or comprises one strand that is derived completely from only one of two nucleic
acid molecules linked according to the method. Where the recombinant nucleic acid
molecule contains a nick, the nick can be converted to a phosphodiester bond, if desired,
for example, by contacting the nicked recombinant nucleic acid molecule with a DNA
ligase, by introducing the nicked recombinant nucleic acid molecule into a cell such as a
bacterium that can repair the nick, or by any other method as desired. Thus, in one
embodiment, a method of the invention includes a strand invasion event and a ligation.

Where a recombinant nucleic acid molecule generated according to a method of
the invention does not comprise one strand that is derived completely from only one of
the starting nucleic acid molecules, the method can further include a cleavage step,
wherein the displaced nucleotide sequence is cleaved from the product. Such a cleaving
step can be performed using any method known to cleave or degrade a single stranded
nucleotide sequence, including for example, contacting a recombinant nucleic acid
molecule comprising the displaced strand with an enzyme having 5' to 3' or 3' to 5' single
stranded nucleic acid exonuclease activity (depending on the orientation of the displaced
strand). Such a method conveniently can be performed in vitro. Alternatively, the
recombinant ds nucleic acid molecule can be introduced into a cell, for example an E. coli
cell, wherein the displaced nucleotide sequence is cleaved.

A method of the invention can be used to generate a directionally linked
recombinant nucleic acid molecule encoding a chimeric fusion polypeptide. For
generating such a recombinant nucleic acid molecule, a first and second (or other)
ds nucleic acid molecule each can encode all or a portion of an open reading frame, and
the first and second (or other) ds nucleic acid molecules, which have first and/or second
ends as disclosed herein, are directionally linked. The chimeric polypeptide can comprise
a fusion polypeptide, in which the two (or more) encoded peptides (or polypeptides) are
translated into a single product, i.e., the peptides are covalently linked through a peptide
bond.

For example, a first ds nucleic acid molecule can encode a cell
compartmentalization domain, such as a plasma membrane localization domain, a nuclear
localization signal, a mitochondrial membrane localization signal, an endoplasmic reticulum
localization signal, or the like, or a protein transduction domain such as the human
immunodeficiency virus TAT protein transduction domain, which can facilitate translocation
of a peptide linked thereto into a cell (see Schwarze et al., Science 285:1569-1572, 1999;
Derossi et al., J. Biol. Chem. 271:18188, 1996; Hancock et al., EMBO J. 10:4033-4039,
1991; Buss et al., Mol. Cell. Biol. 8:3960-3963, 1988; U.S. Pat. No. 5,776,689; each of
which is incorporated herein by reference). Such a domain can be useful to target a fusion
polypeptide comprising the domain and a polypeptide encoded by a second ds nucleic acid
molecule, to which it is directionally linked according to a method of the invention, to a
particular compartment in the cell, or for secretion from or entry into a cell. As such, the
invention provides a means to generate directionally linked recombinant nucleic acid
molecules that encode a chimeric polypeptide.

A fusion polypeptide expressed from a directionally linked recombinant nucleic
acid molecule generated according to a method of the invention also can comprise a
peptide having the characteristic of a detectable label or a tag such that the expressed
fusion polypeptide can be detected, isolated, or the like. For example, a first, second or
other ds nucleic acid molecule containing a topoisomerase recognition site, or cleavage
product thereof, as disclosed herein, can encode an enzyme such as alkaline phosphatase,
β-galactosidase, chloramphenicol acetyltransferase, luciferase, or other enzyme; or can
encode a peptide tag such as a polyhistidine sequence (e.g., hexahistidine), a V5 epitope,
a c-myc epitope, a hemagglutinin A epitope, a FLAG epitope, or the like. Expression of a
fusion polypeptide comprising a detectable label can be detected using the appropriate
reagent, for example, by detecting light emission upon addition of luciferin to a fusion
polypeptide comprising luciferase, or by detecting binding of nickel ion to a fusion
polypeptide comprising a polyhistidine tag.

A polyhistidine tag can comprise from about two to about ten contiguous histidine
residues (e.g., two, three, four, five, six, seven, eight, nine, or ten contiguous histidine
residues). The tag can also be a peptide tag which binds nickel ions, as well as other
metal ions (e.g., copper ion), and can be used for metal chelate affinity chromatography.
Examples of such tags include peptides having the formula: R1-(His-X)n-R2, wherein
(His-X)n represents a metal chelating peptide and n is a number between two through ten
(e.g., two, three, four, five, six, seven, eight, nine, or ten), and X is an amino acid selected
from the group consisting of alanine, arginine, aspartic acid, asparagine, cysteine,
glutamic acid, glutamine, glycine, histidine, iso-leucine, leucine, lysine, methionine,
phenylalanine, proline, serine, threonine, tryptophan, tyrosine and valine. Further, R2
may be a polypeptide which is covalently linked to the metal chelating peptide and R1
may be either a hydrogen or one or more (e.g., one, two, three, four, five, six, seven,
eight, nine, ten, twenty, thirty, fifty, sixty, etc.) amino acid residues. In addition, R1 may
be a polypeptide which is covalently linked to the metal chelating peptide and R2 may be
either a hydrogen or one or more (e.g., one, two, three, four, five, six, seven, eight, nine,
ten, twenty, thirty, fifty, sixty, etc.) amino acid residues. Tags of this nature are described
in U.S. Pat. No. 5,594,115, the entire disclosure of which is incorporated herein by
reference.
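
As an illustrative sketch only, a (His-X)n metal chelating peptide as defined
above, with n from two through ten, can be located in a protein sequence
written in one-letter amino acid code using a regular expression. The names
X_RESIDUES, HIS_X_REPEAT, and find_his_x_tags, and the example sequences, are
hypothetical and are introduced here solely for illustration.

```python
import re

# One-letter codes for the twenty amino acids recited for position X.
X_RESIDUES = "ARDNCEQGHILKMFPSTWYV"

# (His-X)n with n from two through ten, in one-letter code.
HIS_X_REPEAT = re.compile(r"(?:H[%s]){2,10}" % X_RESIDUES)


def find_his_x_tags(protein):
    """Return (start, matched_text, n) for each (His-X)n run in the sequence.

    A run longer than ten repeats is reported as a ten-repeat match followed
    by any remaining match, because the pattern caps n at ten.
    """
    hits = []
    for match in HIS_X_REPEAT.finditer(protein.upper()):
        text = match.group(0)
        hits.append((match.start(), text, len(text) // 2))
    return hits


print(find_his_x_tags("MKHAHGHQLLS"))
print(find_his_x_tags("MHHHHHHGS"))
```

Under this pattern a hexahistidine sequence is reported as three His-X
repeats in which X is itself histidine, consistent with histidine being among
the residues recited for X.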

Similarly, isolation of a fusion polypeptide comprising a tag can be performed, for
example, by passing a fusion polypeptide comprising a myc epitope over a column having
an anti-c-myc epitope antibody bound thereto, then eluting the bound fusion polypeptide,
or by passing a fusion polypeptide comprising a polyhistidine tag over a nickel ion or
cobalt ion affinity column and eluting the bound fusion polypeptide. Methods for
detecting or isolating such fusion polypeptides will be well known to those in the art,
based on the selected detectable label or tag (see, for example, Hopp et al.,
BioTechnology 6:1204, 1988; U.S. Pat. No. 5,011,912; each of which is incorporated
herein by reference).

In one embodiment, the directionally linked recombinant nucleic acid molecules
encode chimeric polypeptides useful for performing a two hybrid assay. In such a
method, the first ds nucleic acid molecule encodes a polypeptide, or a relevant domain
thereof, that is suspected of having or being examined for the ability to interact
specifically with one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) other polypeptides.
The second ds nucleic acid molecule, to which the first ds nucleic acid molecule is to be
directionally linked according to a method of the invention, can encode a transcription
activation domain or a DNA binding domain. For example, a first ds nucleic acid
molecule to be directionally linked is modified, for example, to contain a 5' overhang on
a first end and a topoisomerase recognition site, or cleavage product thereof, at or near the
first end. A second ds nucleic acid molecule to be linked contains, or is modified to
contain, a 5' sequence complementary to the 5' overhang at the first end of the first
ds nucleic acid molecule. Upon contact of the first and second ds nucleic acid molecules
with a topoisomerase, the directionally linked nucleic acid molecule encodes a first
hybrid useful for performing a two hybrid assay (see, for example, Fields and Song,
Nature 340:245-246, 1989; U.S. Pat. No. 5,283,173; Fearon et al., Proc. Natl. Acad. Sci.,
USA 89:7958-7962, 1992; Chien et al., Proc. Natl. Acad. Sci., USA 88:9578-9582, 1991;
Young, Biol. Reprod. 58:302-311 (1998), each of which is incorporated herein by
reference). Similar methods are used to generate the second hybrid protein, which can
comprise a plurality of polypeptides to be tested for the ability to interact with the
polypeptide, or domain thereof, of the first hybrid protein. Such methods similarly can be
used to construct directionally linked nucleic acid molecules encoding fusion proteins
useful for a modified form of a two hybrid assay such as the reverse two hybrid assay
(Leanna and Hannink, Nucl. Acids Res. 24:3341-3347, 1996, which is incorporated
herein by reference), the repressed transactivator system (U.S. Pat. No. 5,885,779, which
is incorporated herein by reference), the protein recruitment system (U.S. Pat.
No. 5,776,689, which is incorporated herein by reference), and the like.

As disclosed herein, a second ds nucleic acid molecule can be one of a plurality of
nucleic acid molecules, for example, a cDNA library, a combinatorial library of nucleic
acid molecules, or a population of variegated nucleic acid molecules. As such, the
methods of the invention are particularly useful for generating recombinant
polynucleotides encoding chimeric polypeptides for performing a high throughput two
hybrid assay for identifying protein-protein interactions that occur among populations of
polypeptides (see U.S. Pat. No. 6,057,101 and U.S. Pat. No. 6,083,693, each of which is
incorporated herein by reference). In such a method, each of the hybrid proteins of the
two hybrid assay is generated using a different one of two populations (pluralities) of
nucleic acid molecules encoding polypeptides, each plurality having a complexity of from
a few related but different nucleic acid molecules to as high as tens of thousands of such
molecules. By performing a method of the invention, for example, using a PCR primer
pair to amplify each nucleic acid molecule in a plurality, directionally linked recombinant
polynucleotides encoding a population of chimeric bait polypeptides and a population of
chimeric prey polypeptides readily can be generated. Such populations are generated by
contacting the amplified pluralities of nucleic acid molecules, each of which comprises an
appropriate end, with a topoisomerase and a nucleic acid molecule which contains a
topoisomerase recognition site at or near its ends and encodes a transcription activation
domain or a DNA binding domain.

A first ds nucleic acid molecule useful in a method of the invention also can
encode a ribonucleic acid (RNA) molecule, which can function, for example, as a
riboprobe, an antisense nucleic acid molecule, a ribozyme, or a triplexing nucleic acid
molecule, or can be used in an in vitro translation reaction, and the second ds nucleic acid
molecule can encode an expression control element useful for expressing an RNA from
the first nucleic acid molecule. For example, where it is desired to produce a large
amount of RNA, a second ds nucleic acid molecule component for performing a method
of the invention can comprise an RNA polymerase promoter such as a T7, T3 or SP6
RNA polymerase promoter. Where the RNA molecule is to be expressed in a cell, for
example, an antisense molecule to be expressed in a mammalian cell, the second (or
other) ds nucleic acid molecule can include a promoter that is active in a mammalian cell,
particularly a tissue specific promoter, which is active only in a target cell. Furthermore,
where the RNA molecule is to be translated, for example, in a coupled in vitro
transcription/translation reaction, the first nucleic acid molecule or second (or other)
nucleic acid molecule can contain appropriate translational expression control elements.

A directionally or non-directionally linked recombinant nucleic acid molecule
generated according to a method of the invention can be used for various purposes for
which recombinant vectors containing a directionally or non-directionally inserted nucleic
acid molecule are generally used. Thus, the directionally or non-directionally linked
nucleic acid molecule can be used, for example, for expressing a polypeptide in a cell, for
diagnosing or treating a pathologic condition, or the like. For administration to a living
subject, the directionally or non-directionally linked recombinant nucleic acid molecule
generally is formulated in a pharmaceutical composition suitable for administration to the
subject. Thus, the invention provides pharmaceutical compositions containing a
directionally or non-directionally linked recombinant nucleic acid molecule generated
according to a method of the invention and expression products of this nucleic acid
molecule. As such, the nucleic acid molecule is useful as a medicament for treating a
subject suffering from a pathological condition.

Pharmaceutically acceptable carriers are well known in the art and include, for
example, aqueous solutions such as water or physiologically buffered saline or other
solvents or vehicles such as glycols, glycerol, oils such as olive oil or injectable organic
esters. A pharmaceutically acceptable carrier can contain physiologically acceptable
compounds that act, for example, to stabilize or to increase the absorption of the
conjugate. Such physiologically acceptable compounds include, for example,
carbohydrates, such as glucose, sucrose or dextrans, antioxidants, such as ascorbic acid or
glutathione, chelating agents, low molecular weight proteins or other stabilizers or
excipients. One skilled in the art would know that the choice of a pharmaceutically
acceptable carrier, including a physiologically acceptable compound, depends, for
example, on the route of administration of the composition, which can be, for example,
orally or parenterally such as intravenously, and by injection, intubation, or other such
method known in the art. The pharmaceutical composition also can contain a second
reagent such as a diagnostic reagent, nutritional substance, toxin, or therapeutic agent, for
example, a cancer chemotherapeutic agent.

The directionally linked recombinant nucleic acid molecule can be incorporated
within an encapsulating material such as into an oil-in-water emulsion, a microemulsion,
micelle, mixed micelle, liposome, microsphere or other polymer matrix (see, for example,
Gregoriadis, Liposome Technology, Vol. 1 (CRC Press, Boca Raton, FL 1984); Fraley, et
al., Trends Biochem. Sci., 6:77 (1981), each of which is incorporated herein by
reference). Liposomes, for example, which consist of phospholipids or other lipids, are
nontoxic, physiologically acceptable and metabolizable carriers that are relatively simple
to make and administer. "Stealth" liposomes (see, for example, U.S. Pat. Nos. 5,882,679;
5,395,619; and 5,225,212, each of which is incorporated herein by reference) are an
example of such encapsulating materials particularly useful for preparing a
pharmaceutical composition, and other "masked" liposomes similarly can be used, such
liposomes extending the time that a nucleic acid molecule remains in the circulation.
Cationic liposomes, for example, also can be modified with specific receptors or ligands
(Morishita et al., J. Clin. Invest., 91:2580-2585 (1993), which is incorporated herein by
reference). The nucleic acid molecule also can be introduced into a cell by complexing it
with an adenovirus-polylysine complex (see, for example, Michael et al., J. Biol. Chem.
268:6866-6869 (1993), which is incorporated herein by reference). Such compositions
can be particularly useful for introducing a nucleic acid molecule into a cell in vivo or in
vitro, including ex vivo, wherein the cell containing the nucleic acid molecule is
administered back to the subject (see U.S. Pat. No. 5,399,346, which is incorporated
herein by reference). A nucleic acid molecule generated according to a method of the
invention also can be introduced into a cell using a biolistic method (see, for example,
Sykes and Johnston, supra, 1999).

As disclosed herein, a directionally linked nucleic acid molecule generated
according to a method of the invention contains a nick, which can be resolved, for
example, by contacting the nicked recombinant nucleic acid molecule with a ligase. Such
a directionally linked recombinant nucleic acid molecule that is covalently linked in both
strands can be used as a template for an amplification reaction such as PCR. As such, a
large amount of the construct can be generated. Furthermore, an amplification reaction
can provide an in vitro selection method for obtaining only a desired product, without
obtaining partial reaction products. For example, a method of the invention can be used
to generate a directionally linked recombinant nucleic acid molecule comprising,
operatively linked in a 5' to 3' orientation, a first ds nucleic acid molecule comprising a
promoter, a second ds nucleic acid molecule comprising a coding region, and a third
ds nucleic acid molecule comprising a polyadenylation signal, wherein the nicks in the
generated recombinant nucleic acid molecule are ligated.

By selecting a PCR primer pair including a first primer complementary to a
nucleotide sequence upstream of the promoter sequence, and a second primer
complementary to a nucleotide sequence downstream of the polyadenylation signal, a
functional amplification product comprising the promoter, coding region and
polyadenylation signal can be generated. In contrast, partial reaction products that lack
either the first or the third ds nucleic acid molecule are not amplified because
either the first or second primer, respectively, will not hybridize to the partial product. In
addition, a construct lacking the second ds nucleic acid molecule would not be generated
due to the lack of complementarity of the 5' overhanging sequences of the first and third
ds nucleic acid molecules. As such, the invention provides, in part, a means to obtain a
desired functional, directionally linked recombinant nucleic acid molecule.
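
The selection logic described above can be sketched, purely for illustration,
as a small Python model. The fragment names, the four-nucleotide joining
sequences, and the functions can_join and is_amplified are hypothetical and
are not part of the disclosure; joining is simplified to exact reverse
complementarity of adjacent ends, and priming is simplified to the presence of
the fragment carrying each primer binding site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical joining ends; None marks an end that is not designed to join.
FRAGMENTS = {
    "promoter": {"left": None, "right": "GGTG"},
    "coding": {"left": "CACC", "right": "AGTC"},
    "polyA": {"left": "GACT", "right": None},
}


def can_join(upstream, downstream):
    """True if the right end of `upstream` is complementary to the left end of `downstream`."""
    right = FRAGMENTS[upstream]["right"]
    left = FRAGMENTS[downstream]["left"]
    return right is not None and left is not None and reverse_complement(right) == left


def is_amplified(product, forward_site="promoter", reverse_site="polyA"):
    """A product is amplified only if every junction can form and the fragments
    carrying both primer binding sites are present."""
    junctions_ok = all(can_join(a, b) for a, b in zip(product, product[1:]))
    return junctions_ok and forward_site in product and reverse_site in product


for candidate in (["promoter", "coding", "polyA"],
                  ["promoter", "coding"],
                  ["coding", "polyA"],
                  ["promoter", "polyA"]):
    print(candidate, is_amplified(candidate))
```

In this model only the three-part product is amplified: the two-part
products each lack one primer binding site, and the promoter cannot join the
polyadenylation signal directly because their ends are not complementary.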

The use of an amplification reaction such as PCR in such a manner further
provides a means to screen a large number of nucleic acid molecules generated according
to a method of the invention in order to identify constructs of interest. Since methods for
utilizing PCR in automated high throughput analyses are routine and well known, it will
be recognized that the methods of the invention can be readily adapted to use in a high
throughput system. Using such a system, a large number of constructs can be screened in
parallel, and partial or incomplete reaction products can be identified and disposed of,
thereby preventing a waste of time and expense that would otherwise be required to
characterize the constructs or examine the functionality of the constructs in further
experiments.

Recombinant nucleic acid molecules generated by a method of the invention
wherein the first ds nucleic acid molecule contains a first topoisomerase but not a second
topoisomerase or topoisomerase binding site, are generally linear, whereas, in other
aspects, the methods of the invention can generate circular recombinant nucleic acid
molecules. However, a directionally linked recombinant nucleic acid molecule that is
generated as a linear molecule can be circularized, for example, where it is to be used as a
vector. In addition, a linear directionally linked recombinant nucleic acid molecule
generated by a method of the invention can be cloned into a vector, which can be a
plasmid vector or a viral vector such as a bacteriophage, baculovirus, retrovirus,
lentivirus, adenovirus, Vaccinia virus, semliki forest virus and adeno-associated virus
vector, all of which are well known and can be purchased from commercial sources
(Invitrogen Corp., La Jolla CA; Promega, Madison WI; Stratagene, La Jolla CA;
GIBCO/BRL, Gaithersburg MD).

The methods of the invention also can be used to detectably label a nucleic acid
molecule with a chemical or small organic or inorganic moiety such that the nucleic acid
molecule is useful as a probe. For example, a first ds nucleic acid molecule, which has a
topoisomerase recognition site, or cleavage product thereof, at a 3' terminus, can have
bound thereto a detectable moiety such as a biotin, which can be detected using avidin or
streptavidin, a fluorescent compound (e.g., Cy3, Cy5, Fam, fluorescein, or rhodamine), a
radionuclide (e.g., sulfur-35, technetium-99, phosphorus-32, or tritium), a paramagnetic
spin label (e.g., carbon-13), a chemiluminescent compound, an epitope, for example a
peptide epitope, which can be detected using an antibody that recognizes the epitope, or
the like, such that, upon generating a directionally linked double stranded recombinant
nucleic acid molecule according to a method of the invention, the generated nucleic acid
molecule will be labeled. Methods of detectably labeling a nucleic acid molecule with
such moieties are well known in the art (see, for example, Hermanson, "Bioconjugate
Techniques" (Academic Press 1996), which is incorporated herein by reference). It
should be recognized that such elements as disclosed herein or otherwise known in the
art, including nucleic acid molecules encoding cell compartmentalization domains, or
detectable labels or tags, or comprising transcriptional or translational expression control
elements can be useful components of a kit as disclosed herein.

A method of the invention, in which a first ds nucleic acid molecule with a first
topoisomerase, but not a second topoisomerase or topoisomerase recognition site, is used
can be particularly useful for generating an expressible recombinant nucleic acid
molecule that can be inserted in a site specific manner into a target DNA sequence. The
target DNA sequence can be any DNA sequence, particularly a genomic DNA sequence,
and preferably a gene for which some or all of the nucleotide sequence is known. The
method can be performed utilizing a first ds nucleic acid molecule, which has a first end
and a second end and encodes a polypeptide, for example, a selectable marker, wherein
the first ds nucleic acid molecule comprises a complementary sequence at both ends; and
directionally linking the first ds nucleic acid molecule to first and second PCR
amplification products, which are generated from sequences upstream and downstream of
the site at which the construct is to be inserted, wherein the amplification products each
contain a topoisomerase recognition site and a 5' target sequence selected based on the
factors set forth in the present disclosure. For example, the first and second amplification
products have different 5' target sequences such that, upon cleavage by a topoisomerase,
each can be linked to a predetermined end of the first ds nucleic acid molecule.

The first and second amplification products are generated using two sets of PCR
primer pairs. The two sets of PCR primer pairs are selected such that, in the presence of
an appropriate polymerase such as Taq polymerase and a template comprising the
sequences to be amplified, the primers amplify portions of a target DNA sequence that are
upstream of and adjacent to, and downstream of and adjacent to, the site for insertion of
the selectable marker. In addition, the sets of PCR primer pairs are designed such that the
amplification products contain a topoisomerase recognition site and, following cleavage
by the site specific topoisomerase, a 5' overhanging sequence at the end to be
directionally linked to the selectable marker. As such, the first PCR primer pair includes
1) a first primer, which comprises, in an orientation from 5' to 3', a nucleotide sequence
complementary to a 5' complementary sequence of the end of the selectable marker to
which the amplification product is to be directionally linked, a nucleotide sequence
complementary to a topoisomerase recognition site, and a nucleotide sequence
complementary to a 3' sequence of a target DNA sequence upstream of the insertion site;
and 2) a second primer, which comprises a nucleotide sequence of the target genomic
DNA upstream of the 3' sequence to which the first primer is complementary, i.e.,
downstream of the insertion site. The second PCR primer pair includes 1) a first primer,
which comprises, from 5' to 3', a nucleotide sequence complementary to the
5' complementary sequence of the end of the selectable marker to which it is to be
directionally linked, a nucleotide sequence complementary to a topoisomerase recognition
site, and a nucleotide sequence of a 5' sequence of a target DNA sequence, wherein the
5' sequence of the target genomic DNA is downstream of the 3' sequence of the target
DNA sequence to which the first primer of the first PCR primer pair is complementary;
and the second primer of the second primer pair comprises a nucleotide sequence
complementary to a 3' sequence of the target DNA sequence that is downstream of the
5' sequence of the target genomic DNA contained in the first primer. The skilled artisan
will recognize that the sequences of the primers that are complementary to the target
genomic DNA are selected based on the sequence of the target DNA.
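
The three-part layout of the first primer of the first PCR primer pair can be
sketched, for illustration only, as a simple concatenation in Python. The
function build_first_primer and the example sequences are hypothetical. The
recognition site CCCTT is an assumption, being the pentameric site commonly
cited for Vaccinia topoisomerase I, and "complementary" is modeled as the
reverse complement so that the primer reads 5' to 3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Assumption: pentameric site commonly cited for Vaccinia topoisomerase I.
TOPO_SITE = "CCCTT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_first_primer(marker_end_seq, target_3prime_seq, topo_site=TOPO_SITE):
    """Assemble, 5' to 3', the complement of the marker end sequence, the
    complement of the topoisomerase recognition site, and the complement of
    the target 3' sequence."""
    return (
        reverse_complement(marker_end_seq)
        + reverse_complement(topo_site)
        + reverse_complement(target_3prime_seq)
    )


# Hypothetical inputs: a 4-nt marker end sequence and a 6-nt target sequence.
primer = build_first_primer("GGTG", "ATGGCA")
print(primer)
# The strand copied from this primer carries the target sequence, then the
# recognition site, then the marker end sequence, reading 5' to 3'.
print(reverse_complement(primer))
```

Real primers would use a target-specific portion long enough to anneal
selectively; the short sequences here serve only to make the ordering of the
three elements visible.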
Upon contact of the first ds nucleic acid molecule comprising the selectable
marker, the first and second amplification products (i.e., second and third ds nucleic acid
molecules), and a topoisomerase (unless the molecules are topoisomerase-charged), a
directionally linked recombinant nucleic acid molecule is generated. Following ligation
of the nicks, the generated recombinant nucleic acid molecule can be further amplified, if
desired, using PCR primers that are specific for an upstream and downstream sequence of
the target genomic DNA, thus ensuring that only functional constructs are amplified.
Such a generated directionally linked recombinant nucleic acid molecule is useful for
performing homologous recombination in a genome, for example, to knock-out the
function of a gene in a cell, or to confer a novel phenotype on the cell containing the
generated recombinant nucleic acid molecule. The method can further be used to produce
a transgenic non-human organism having the generated recombinant nucleic acid
molecule stably maintained in its genome.

A method of the invention involving a first ds nucleic acid having a topoisomerase
or topoisomerase-recognition site, for example, at a first end but not the second end, also
can be useful for covalently linking an adapter or linker sequence to one or both ends of a
second ds nucleic acid molecule of interest, including to each of a plurality of second (or
other) ds nucleic acid molecules. For example, where it is desired to put linkers on both
ends of a first ds nucleic acid molecule, the method can be performed by contacting a
topoisomerase with a first ds nucleic acid molecule, which has a 5' complementary
sequence at or near each 5' terminus that is complementary to an overhang sequence on a
5' terminus of each of the second and third ds nucleic acid molecules; and a second ds
nucleic acid molecule and a third ds nucleic acid molecule, each of which includes a
topoisomerase recognition site at the appropriate 3' terminus and a 5' overhang sequence
at or near the end containing the topoisomerase recognition site. An appropriate terminus
is the terminus to which the linker is to be directionally linked to the first ds nucleic acid
molecule. In performing such a method, the linker sequences comprising the second and
at least third nucleic acid molecule can be the same or different.

A method of the invention involving a first ds nucleic acid molecule with a
5' target sequence and a topoisomerase on a first end, can be performed to directionally or
non-directionally link a first ds nucleic acid molecule to at least a second ds nucleic acid
molecule. The method typically is used to directionally link the first ds nucleic acid
molecule and the second ds nucleic acid molecule. However, the method can be used to
non-directionally link the first ds nucleic acid molecule and the second ds nucleic acid
molecule in the following embodiments: 1) Where a second nucleotide sequence is
present at or near the 5' terminus of the second end of the first ds nucleic acid molecule,
that is capable of hybridizing to the 5' complementary nucleotide sequence at the second
end of the second ds nucleic acid molecule; and 2) Where a nucleotide sequence is
present at or near the 5' terminus of both the first end and the second end of the second ds
nucleic acid molecule that is capable of hybridizing to the 5' overhang at the first end of
the first ds nucleic acid molecule. In these embodiments involving non-directional
linking, the second end of the first ds nucleic acid molecule and the second end of the
second ds nucleic acid molecule can be either blunt, or include an overhang.

In embodiments involving linking a third ds nucleic acid molecule to the second
ds nucleic acid molecule, the methods can be used to directionally or non-directionally
link the two nucleic acid molecules. The method typically is used to directionally link the
second ds nucleic acid molecule and the third ds nucleic acid molecule. However, the
method can be used to non-directionally link the third ds nucleic acid molecule and the
second ds nucleic acid molecule in the following embodiments: 1) Where a second
nucleotide sequence is present at or near the 5' terminus of the second end of the third ds
nucleic acid molecule, that is capable of hybridizing to the 5' complementary nucleotide
sequence at the second end of the second ds nucleic acid molecule; and 2) Where a
nucleotide sequence is present at or near the 5' terminus of both the first end and the
second end of the second ds nucleic acid molecule that is capable of hybridizing to the
5' overhang at the first end of the third ds nucleic acid molecule. In these embodiments
involving non-directional linking, the second end of the third ds nucleic acid molecule
and the second end of the second ds nucleic acid molecule can be either blunt, or include
an overhang.

The present invention also provides a composition, which includes a first
ds nucleic acid molecule having a first and a second end, wherein the first end has a
5' overhang and a topoisomerase covalently bound at the 3' terminus; and a second
ds nucleic acid molecule having a first blunt end and a second end, wherein the first blunt
end has a first 5' nucleotide sequence, which is complementary to the first 5' overhang,
and a first 3' nucleotide sequence complementary to the first 5' nucleotide sequence. In
such a composition, the first 5' nucleotide sequence of the first blunt end of the second
ds nucleic acid molecule can be hybridized to the first 5' overhang of the first end of the
first nucleic acid molecule, wherein the first 3' nucleotide sequence of the first blunt end
of the second ds nucleic acid molecule is displaced. The first ds nucleic acid molecule in
such a composition can further have a second 5' overhang at the second end, and the
second end of the second ds nucleic acid molecule can further include a second
5' nucleotide sequence, which is complementary to the second 5' overhang, and a second
3' nucleotide sequence complementary to the second 5' nucleotide sequence.

The present invention also provides kits, which contain one or more (e.g., 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, etc.) reagents useful for directionally or non-directionally linking ds
nucleic acid molecules. In one embodiment, a kit of the invention contains a ds nucleic
acid molecule having a first end and a second end, wherein the first end contains a first
5' overhang and a first topoisomerase covalently bound at the 3' terminus, and the second
end contains a second topoisomerase covalently bound at the 3' terminus and contains a
second 5' overhang, a blunt end, a 3' uridine overhang, or a 3' thymidine overhang,
wherein the first 5' overhang is different from the second 5' overhang. The
topoisomerases, which can be the same or different, also can be a component of the kit.
The ds nucleic acid molecule in the kit can, but need not be a vector, and can contain one
or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, etc.) expression control elements, as well as
instructions for using kit components.

A kit of the invention also can include a plurality of second ds nucleic acid
molecules, wherein each ds nucleic acid molecule in the plurality has a first blunt end,
and wherein the first blunt end includes a 5' nucleotide sequence complementary to the
first 5' overhang of the first ds nucleic acid molecule. The second ds nucleic acid
molecules in the plurality can be a plurality of transcriptional regulatory elements,
translational regulatory elements, or a combination thereof, or can encode a plurality of
peptides such as peptide tags, cell compartmentalization domains, and the like.

A ds nucleic acid molecule component of a kit can be, for example, a linearized
vector such as a cloning vector or expression vector. If desired, such a kit can contain a
plurality of ds nucleic acid molecules, each comprising a different expression control
element or other element such as, but not limited to, a sequence encoding a tag or other
detectable molecule or a cell compartmentalization domain. The different elements can
be different types of a particular expression control element, for example, constitutive or
inducible promoters or tissue specific promoters, or can be different types of elements
including, for example, transcriptional and translational expression control elements,
epitope tags, and the like. Such ds nucleic acid molecules may be topoisomerase-activated
or can be activated with topoisomerase, and contain 5' overhanging sequences,
or sequences that become 5' overhanging sequences after topoisomerase activation. In
addition, the plurality of ds nucleic acid molecules can have 5' overhanging sequences
that are unique to a particular expression control element, or that are common to a plurality
of related expression control elements, for example, to a plurality of different promoter
elements. The 5' overhanging sequences of ds nucleic acid molecules can be designed
such that one or more expression control elements contained on the ds nucleic acid
molecule can be operatively directionally linked to provide a useful function, for
example, an element comprising a Kozak sequence and an element comprising a
translation start site can have complementary 5' overhangs such that the elements can be
operatively directionally linked according to a method of the invention.

The invention further provides kits for linking nucleic acid molecules using
methods described herein. Thus, kits of the invention may comprise one or more
components for performing methods described herein. In particular embodiments, kits of
the invention may comprise one or more components selected from the group consisting of
instructions for use of kit components, one or more buffers, one or more nucleic acid
molecules (e.g., one or more nucleic acid molecules having a 5' overhang, a 3' overhang,
a 5' overhang and a 3' overhang, two 3' overhangs, two 5' overhangs, etc.), one or more
topoisomerases, one or more ligases, one or more recombinases, one or more adapter linkers
for preparing molecules having a 5' overhang and/or a 3' overhang, and/or one or more
containers in which to perform methods of the invention.

The following examples are intended to illustrate but not limit the invention. 

EXAMPLE 1 

In a preferred embodiment of the present invention, a topoisomerase-charged ds
nucleic acid molecule is made by first obtaining a commercially available cloning vector.
One such vector is pUni/V5-His version A (Invitrogen Corp., Carlsbad, CA), a circular
supercoiled vector that contains uniquely designed elements. These elements include a
BGH polyadenylation sequence to increase mRNA stability in eukaryotic hosts, a
T7 transcription termination region, an R6Kγ DNA replication origin and a kanamycin
resistance gene and promoter for antibiotic resistance selection. Additionally,
pUni/V5-His version A contains a multiple cloning site, which is a synthetic DNA
sequence encoding a series of restriction endonuclease recognition sites. These sites are
engineered for cloning of DNA into a vector at a specific position. Also within the
vector's multiple cloning site is a loxP site inserted 5' to the endonuclease recognition
sites thereby facilitating Cre recombinase-mediated fusion into a variety of other
expression vectors (Echo™ Cloning System, Invitrogen Corp., Carlsbad, CA). An
optional C-terminal V5 epitope tag is present for easy detection of expressed fusion
proteins using an Anti-V5 Antibody. An optional C-terminus polyhistidine (6xHis) tag is
also present to enable rapid purification and detection of expressed proteins. A bacterial
ribosomal binding site downstream from the loxP site makes transcription initiation in
E. coli possible. Though this combination of elements is specific for pUni/V5-His
version A cloning vector, many similar cloning and expression vectors are commercially
available or can be assembled from sequences and by methods well known in the art.
pUni/V5-His version A is a 2.2 kb double stranded plasmid (see Figures 3 and 5).

Construction of a topoisomerase I charged cloning vector from pUni/V5-His
version A is accomplished by endonuclease digestion of the vector, followed by
complementary annealing of synthetic oligonucleotides and site-specific cleavage of the
heteroduplex by Vaccinia topoisomerase I. SacI and EcoRI are two of the many
restriction endonuclease sites present within the multiple cloning site of pUni/V5-His
version A (see FIG. 3). Digestion of pUni/V5-His version A with the corresponding
restriction enzymes, SacI and EcoRI, will leave cohesive ends on the vector (5'-AGCT-3'
and 5'-AATT-3', Figure 6). These enzymes are readily available from numerous
vendors including New England Biolabs (Beverly, MA, Catalogue Nos. R0156S, SacI
and R0101S, EcoRI). The digested pUni/V5-His version A is easily separated from the
digested fragments using isopropanol precipitation. These and other methods for
digesting and isolating DNA are well known to those skilled in the art (Sambrook et al.,
(1989) Molecular Cloning, A Laboratory Manual, Second edition, Cold Spring Harbor
Laboratory Press; pages 5.28-5.32).

The purified, digested vector is then incubated with two specific oligonucleotide
adapters and T4 DNA ligase. The adapters are oligonucleotide duplexes containing ends
that are compatible with the SacI and EcoRI ends of the vector. One of skill in the art
will readily appreciate that other adapter oligonucleotides with appropriate sequences can
be made for other vectors having different restriction sites. Following incubation with
T4 DNA ligase, the vector containing the ligated adapters is purified using isopropanol.

The adapter duplex that results from the annealing of TOPO D1 and TOPO D2
has a single-stranded EcoRI overhang at one end and a 12 nucleotide single stranded
overhang at the other end.

The first adapter oligonucleotide (TOPO D1) has complementation to the EcoRI
cohesive end, 3'-TTAA-5'. Furthermore, TOPO D1 has an additional 24 bp including the
topoisomerase consensus pentapyrimidine element 5'-CCCTT located 16 bp upstream of
the 3' end. The remaining sequence and size of the TOPO D1 adapter oligo is variable, and
can be modified to fit a researcher's particular needs. In the current embodiment
5'-AATTGATCCCTTCACCGACATAGTACAG-3' (SEQ ID NO:5) is the full sequence
of the adapter used.

The second adapter oligonucleotide (TOPO D2) must have full complementation
to TOPO D1. TOPO D2 complements directly 5' of the EcoRI cohesive flap, extending
the bottom strand of the linearized vector. Additionally, TOPO D2 contains the sequence
3'-GTGG, which is the target sequence, and single-stranded overhang after topoisomerase
cleavage, for directional cloning. In this embodiment, the single stranded overhang was
chosen to complement the Kozak sequence known to help expression of ORFs in
eukaryotic cells by increasing the efficiency of ribosome binding on the mRNA; however,
sequence and length are highly variable to meet the specific needs of individual users.
The complete sequence of TOPO D2 is 3'-CTAGGGAAGTGG-5' (SEQ ID NO:6).
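
The relationship between TOPO D1, TOPO D2 and the overhang exposed by cleavage can be
checked with a short script. The following is only an illustrative sketch, not part of
the disclosed method: the function names are hypothetical, the TOPO D1 string is the
reading of SEQ ID NO:5 assumed here (EcoRI end, then the stretch complementary to
TOPO D2, then the 12 nucleotide tail), and cleavage is modeled simply as a cut of the
top strand immediately 3' of 5'-CCCTT.

```python
# Sketch under stated assumptions: sequences are written 5'->3' unless noted.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def ends_after_cleavage(top: str, bottom: str, site: str = "CCCTT"):
    """Model cleavage of `top` immediately 3' of `site` (hypothetical helper).

    Returns (five_prime_overhang, leaving_tail):
      five_prime_overhang: bottom-strand bases (5'->3') left unpaired on the vector
      leaving_tail: top-strand bases (5'->3') released downstream of the cut
    """
    cut = top.index(site) + len(site)
    paired = reverse_complement(bottom)  # stretch of `top` that `bottom` pairs with
    start = top.index(paired)
    end = start + len(paired)
    if not (start < cut <= end):
        raise ValueError("cleavage site is not inside the duplex region")
    return bottom[: end - cut], top[cut:]


# Assumed reading of SEQ ID NO:5 (TOPO D1), 5'->3'.
TOPO_D1 = "AATTGATCCCTTCACCGACATAGTACAG"
# SEQ ID NO:6 (TOPO D2) is printed 3'->5' in the text, so it is reversed here.
TOPO_D2 = "CTAGGGAAGTGG"[::-1]

overhang, tail = ends_after_cleavage(TOPO_D1, TOPO_D2)
print("5' overhang on vector:", overhang)
print("released top-strand tail length:", len(tail))
```

Under these assumptions the bottom strand is left with a four nucleotide 5' overhang,
and the released tail is the 16 nucleotides downstream of the CCCTT element.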

Similar to above, the adapter duplex that results from the annealing of
oligonucleotides TOPO D4 and TOPO D5 has a single-stranded SacI overhang at one
end, and a 12 nucleotide single-stranded overhang at the other end.

The third adapter oligonucleotide (TOPO D5) has complementation to the SacI
cohesive end, 3'-TCGA-5'. Similar to TOPO D1, TOPO D5 has additional bases creating
a single stranded overhang. The length and sequence can vary based on the needs of the
user. In the current embodiment TOPO D5's sequence is 5'-AAGGGCGAGCT-3' (SEQ
ID NO:7).

The fourth adapter oligonucleotide (TOPO D4) has full complementation to
TOPO D5, and complements directly 5' of the SacI cohesive flap, extending the top strand
of the linearized vector. TOPO D4 also contains the topoisomerase consensus sequence
5'-CCCTT. The remaining sequence and size of the TOPO D4 adapter oligo is variable and
can be modified to fit a researcher's particular needs. In the current embodiment, the
sequence of TOPO D4 is 3'-GACATGATACAGTTCCCGC-5' (SEQ ID NO:8), which
includes an additional 12 bp single stranded overhang.

These adapter oligonucleotides can be chemically synthesized using any of
numerous techniques, including the phosphoramidite method (Caruthers et al., Meth.
Enzymol. 154:287-313, 1987). This and other methods for the chemical synthesis of
oligos are well known to those of ordinary skill in the art.

Complementary annealing of the purified digested vector and the adapter
oligonucleotides is done by incubation of the DNA in the presence of T4 DNA ligase.
Typical ligation reactions are performed by incubation of a cloning vector with suitable
DNA fragments in the presence of ligase and an appropriate reaction buffer. Buffers for
ligation reactions should contain ATP to provide energy for the reaction, as well as
reducing reagents like dithiothreitol and pH stabilizers like Tris-HCl. The ratio of
concentrations for the cloning vector and the DNA fragments is dependent on each
individual reaction, and formulae for its determination are abundant in the literature
(see, e.g., Protocols and Applications Guide (1991), Promega Corporation, Madison, WI,
p. 45). T4 ligase will catalyze the formation of a phosphodiester bond between adjacent
5'-phosphates and 3'-hydroxyl termini during the incubation. Cohesive end ligation can
generally be accomplished in 30 minutes at 12-15°C, while blunt end ligation requires
4-16 hours at room temperature (Ausubel et al., (1992) Second Edition; Short Protocols
in Molecular Biology, John Wiley & Sons, Inc., New York, NY, pages 3.14-3.37);
however, the parameter range varies for each experiment. In the current embodiment,
purified, digested pUni/V5-His version A and the adapter oligos were incubated in the
presence of T4 ligase and a suitable buffer for sixteen hours at 12.5°C. The resulting
linearized and adapted vector comprises the purified cloning vector attached to the
adapter oligonucleotides through base pair complementation and T4 ligase-catalyzed
phosphodiester bonds (see Figure 7).
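
As an illustration of the kind of formula referred to above, one commonly used rule of
thumb scales the mass of vector by the size ratio of the two molecules and by the desired
molar ratio. The sketch below assumes that rule; the function name, the 100 ng vector
amount and the 10:1 adapter-to-vector ratio are hypothetical values chosen only for the
example and are not taken from the present disclosure.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 3.0) -> float:
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    Assumes both molecules are double stranded, so mass per mole scales
    linearly with length in base pairs.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# Hypothetical example: 100 ng of a 2.2 kb vector, a 28 bp adapter duplex,
# and a 10:1 adapter:vector molar ratio.
adapter_ng = insert_mass_ng(100.0, 2200, 28, molar_ratio=10.0)
print(f"adapter needed: {adapter_ng:.2f} ng")
```

Because the adapters are so much shorter than the vector, even a large molar excess
corresponds to only a few nanograms of oligonucleotide duplex.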

Efficient modification of the adapted vector with topoisomerase requires the
addition of an annealing oligo to generate double stranded DNA on TOPO D1's and
TOPO D4's single stranded overhangs. Vaccinia topoisomerase I initially binds
non-covalently to double stranded DNA. The enzyme then diffuses along the duplex until
locating and covalently attaching to the consensus pentapyrimidine sequence 5'-CCCTT,
forming the topoisomerase adapted complex (see Shuman et al., U.S. Pat.
No. 5,766,891). Modification of the adapted vector takes place in the absence of DNA
ligase to prevent the formation of phosphodiester bonds between the adapted vector and
the annealing oligo, since phosphodiester bonds in the non-scissile strand will prevent the
dissociation of the leaving group upon cleavage (Figures 8 and 9).

The annealing oligonucleotide (TOPO D3) must have complementation to the
single stranded DNA overhangs of TOPO D1 and TOPO D4. In the current embodiment
the overhangs both share the following sequence, 5'-GACATAGTACAG-3' (SEQ ID
NO:9). Therefore, TOPO D3 has the following sequence,
3'-CTGTATCATGTCAAC-5' (SEQ ID NO:10), which comprises full complementation
to the adapter oligos' single stranded overhang and an additional 3 bp overhang,
3'-AAC-5'.
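
The stated complementarity between the annealing oligo and the shared 12 nucleotide
overhang can be confirmed programmatically. This is a minimal sketch with a
hypothetical helper name; it takes SEQ ID NO:9 and SEQ ID NO:10 as printed above and
assumes simple Watson-Crick pairing with no mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealing_extension(overhang: str, annealing_oligo: str) -> str:
    """Return the unpaired 5' bases of the annealing oligo (5'->3').

    Both arguments are 5'->3'. The 3' part of the annealing oligo must pair
    with the whole overhang; otherwise a ValueError is raised.
    """
    target = reverse_complement(overhang)
    if not annealing_oligo.endswith(target):
        raise ValueError("annealing oligo does not cover the overhang")
    return annealing_oligo[: len(annealing_oligo) - len(target)]


OVERHANG = "GACATAGTACAG"            # SEQ ID NO:9, 5'->3'
TOPO_D3 = "CTGTATCATGTCAAC"[::-1]    # SEQ ID NO:10 is printed 3'->5'; reversed here

extra = annealing_extension(OVERHANG, TOPO_D3)
print("unpaired 5' extension (5'->3'):", extra)
```

Read 3' to 5', the unpaired extension returned here corresponds to the 3'-AAC-5'
overhang described in the text.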

Incubation of the adapted vector with the annealing oligo in the presence of
topoisomerase will create double stranded DNA to which topoisomerase can
non-covalently bind (Figure 10). Bound topoisomerase will search the double stranded DNA
by a facilitated diffusion mechanism, until the 5'-CCCTT recognition motif is located.
Cleavage of the phosphodiester backbone of the scissile strand 3' of the motif is catalyzed
via a nucleophilic attack on the 3' phosphorus atom of the preferred oligonucleotide
cleavage sequence 5'-CCCTT, resulting in covalent attachment of the DNA to the
enzyme by a 3'-phosphotyrosyl linkage (see Shuman et al., (1989) Proc. Natl. Acad. Sci.
U.S.A. 86, 9793-9796). Cleavage of the scissile strand creates a double stranded leaving
group comprising the 3' end of the adapter oligo, downstream from the 5'-CCCTT motif, and the
annealing oligo TOPO D3. Although the leaving group can religate to the
topoisomerase-modified end of the vector via 5' hydroxyl-mediated attack of the phosphotyrosyl linkage,
this reaction is disfavored when the leaving group is no longer covalently attached to the
vector. The addition of T4 polynucleotide kinase and ATP to the cleavage/religation
reaction further shifts the equilibrium toward the accumulation of trapped topoisomerase
since the kinase can phosphorylate the 5' hydroxyl of the leaving group to prevent the
rejoining from taking place (Ausubel et al., (1992) Second Edition; Short Protocols in
Molecular Biology, John Wiley & Sons, Inc., New York, NY, pp. 3.14-3.30). The
resulting linearized vector comprises a blunt end from the TOPO D4/D3 leaving group
and a single stranded overhang bearing end from the TOPO D1/D3 leaving group (Figure
11). Both of the linearized cloning vector's ends are charged with topoisomerase,
enabling fast, efficient and directional topoisomerase mediated insertion of an acceptor
molecule.

Although the above example details the modification of pUni/V5-His version A to
form the topoisomerase-modified directional cloning vector, a person of ordinary skill in
the art will appreciate how to apply these methods to any plasmid, cosmid, virus, or other
DNA. It should also be noted that this example demonstrates a vector containing a
5' single stranded overhang comprising the sequence 5'-GGTG-3'; however, the design of
adapter duplexes and annealing oligonucleotides would allow one of skill in the art to
custom design overhangs of any sequence or length at one or both ends of a given vector.

Specifically, any plasmid, cosmid, virus or other DNA can be modified to possess
a single stranded overhang of any convenient sequence and length. These are the basic
steps: the vector is first subjected to a treatment that is known to linearize the DNA.
Common procedures include, but are not limited to, restriction digestion and treatment
with topoisomerase II. Following linearization, a custom single stranded overhang is
added. In the above example, complementary oligonucleotides are added to the sticky
ends of a restriction digestion giving the desired single stranded overhang; however, single
stranded overhang forming oligonucleotides can be added by T4 blunt end ligation, as
well. The single stranded overhang sequence is exposed by a topoisomerase I mediated,
single strand nicking. In turn, this single stranded overhang can be used to directionally
insert a PCR product comprising one or more complementary nucleotide sequences.

Likewise, topoisomerase modification can be applied to any double-stranded
plasmid, cosmid, virus or other piece of DNA. Methods for the attachment of
topoisomerase I to double stranded DNA are well known in the art (see Shuman et al.,
U.S. Pat. No. 5,766,891). The strategic placement of topoisomerase on to a piece of
double stranded DNA is determined by the incorporation of a topoisomerase I consensus
sequence (see Shuman et al., U.S. Pat. No. 5,766,891). The topoisomerase I will bind
the double stranded DNA, nick the scissile strand thus revealing the predetermined
single-stranded overhang sequence, and ligate the incoming PCR product in the correct,
single stranded overhang mediated orientation.

EXAMPLE 2

As an example of the application of the present invention to another plasmid,
pCR® 2.1 (Figures 4 and 12) was modified to create a topoisomerase I adapted vector
with a custom single stranded sequence.

The pCR® 2.1 plasmid is a 3.9 kb T/A cloning vector. Within the sequence of this
vector are many uniquely designed elements. These elements include an f1 origin, a
ColE1 origin, a kanamycin resistance gene, an ampicillin resistance gene, a lacZ-alpha
fragment and a multiple cloning sequence located within the lacZ-alpha fragment
allowing for blue-white selection of recombinant plasmids. The multiple cloning
sequence (FIG. 4) of the pCR® 2.1 plasmid contains numerous restriction sites,
including but not limited to, HindIII, SpeI and EcoRI; M13 forward and reverse primers
and a T7 RNA polymerase promoter.

Construction of the topoisomerase I charged vector possessing a custom single
stranded sequence consists of endonuclease digestion followed by complementary
annealing of synthetic oligonucleotides and the site specific cleavage of the heteroduplex
by Vaccinia topoisomerase I. Digestion of the pCR® 2.1 plasmid with the restriction
enzymes HindIII, SpeI and EcoRI leaves HindIII and EcoRI cohesive ends on the vector
(Figure 13). The dissociated fragment of pCR® 2.1 downstream from the HindIII
cleavage site is further cleaved with SpeI in order to reduce its size. By reducing the size
of the fragment, the digested vector is easily purified away from the smaller digested
pieces by isopropanol precipitation. These enzymes are readily available from numerous
vendors including New England Biolabs (Beverly, MA, Catalogue Nos. R0104S,
HindIII; R0133S, SpeI; R0101S, EcoRI). Methods for the digestion and the isolation of
DNA are well known to those skilled in the art (Sambrook et al., supra, 1989).

The purified digested vector is incubated with four adapter oligonucleotides and
T4 DNA ligase. These adapter oligonucleotides are designed to have complementation to
either the HindIII cohesive end, the EcoRI cohesive end, or to each other. Following
incubation with T4 DNA ligase the adapted vector is purified using isopropanol.

The first adapter oligonucleotide (TOPO H) has complementation to the HindIII
cohesive end, 3'-TCGA-5'. Furthermore, TOPO H has an additional 24 bp including the
topoisomerase consensus pentapyrimidine element 5'-CCCTT located 19 bp upstream of
the 3' end. The remaining sequence and size of the TOPO H adapter oligo is variable, and
can be modified to fit a researcher's particular needs. In the current embodiment
5'-AGCTCGCCCTTATTCCGATAGTG-3' (SEQ ID NO:11) is the full sequence of the
adapter used.

The second adapter oligonucleotide (TOPO 16) must have full complementation
to TOPO H. TOPO 16 complements directly 5' of the HindIII cohesive end, extending
the bottom strand of the linearized vector. Additionally, TOPO 16 contains the sequence
3'-TAAG, which is the chosen single stranded sequence for directional cloning. The
complete sequence of TOPO 16 is 3'-GCGGGAATAAG-5' (SEQ ID NO:12).

The third adapter oligonucleotide (TOPO 1) has complementation to the EcoRI
cohesive end, 3'-TTAA-5'. Similar to TOPO H, TOPO 1 has additional bases containing
the topoisomerase I consensus sequence CCCTT located 12 bp upstream of the 3' end.
The length and sequence of TOPO 1 can vary based on the needs of the user. In the
current embodiment TOPO 1's sequence is 5'-AATTCGCCCTTATTCCGATAGTG-3'
(SEQ ID NO:13).

The fourth adapter oligonucleotide (TOPO 2) has full complementation to
TOPO 1, and complements directly 5' of the EcoRI cohesive end, extending the top strand
of the linearized vector. In the current embodiment, the sequence of TOPO 2 is
3'-GCGGGAA-5'.
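
The four adapters of this example give one end that carries an overhang after cleavage
and one end that is blunt. The sketch below illustrates that difference. It is an
assumption-laden model rather than the disclosed procedure: the helper name is
hypothetical, the TOPO H string is the reading of SEQ ID NO:11 assumed here, and
cleavage is modeled as a cut of the CCCTT-bearing strand immediately 3' of 5'-CCCTT.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def vector_end(scissile: str, partner: str, site: str = "CCCTT") -> str:
    """Describe the vector end left after cleavage 3' of `site` (hypothetical helper).

    `scissile` carries the CCCTT element; `partner` is the oligo paired to it.
    Both are 5'->3'. Returns "blunt" or the 5' overhang sequence (5'->3').
    """
    cut = scissile.index(site) + len(site)
    paired = reverse_complement(partner)
    end = scissile.index(paired) + len(paired)
    if end < cut:
        raise ValueError("partner oligo does not reach the cleavage site")
    unpaired = partner[: end - cut]  # partner bases pairing downstream of the cut
    return unpaired if unpaired else "blunt"


TOPO_H = "AGCTCGCCCTTATTCCGATAGTG"   # assumed reading of SEQ ID NO:11
TOPO_16 = "GCGGGAATAAG"[::-1]        # SEQ ID NO:12 is printed 3'->5'; reversed here
TOPO_1 = "AATTCGCCCTTATTCCGATAGTG"   # SEQ ID NO:13
TOPO_2 = "GCGGGAA"[::-1]             # printed 3'->5' in the text; reversed here

print("HindIII side:", vector_end(TOPO_H, TOPO_16))
print("EcoRI side:  ", vector_end(TOPO_1, TOPO_2))
```

With these inputs the HindIII side keeps a four nucleotide 5' overhang (5'-GAAT-3',
which is 3'-TAAG-5' as written in the text), while the EcoRI side is blunt because
TOPO 2 ends exactly at the cleavage position.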

Complementary annealing of the purified digested vector and the adapter
oligonucleotides is done by incubation of the DNA in the presence of T4 DNA ligase.
T4 ligase will catalyze the formation of a phosphodiester bond between adjacent
5'-phosphates and 3'-hydroxyl termini during the incubation. In the current embodiment,
purified, digested pCR® 2.1 and the adapter oligos were incubated in the presence of
T4 ligase and a suitable buffer for sixteen hours at 12.5°C. The resulting linearized and
adapted vector comprises the purified cloning vector attached to the adapter
oligonucleotides through base pair complementation and T4 ligase-catalyzed
phosphodiester bonds (FIG. 13). Ligation techniques are abundant in the literature (see
Ausubel et al., (1992) Second Edition; Short Protocols in Molecular Biology, John Wiley
& Sons, Inc., New York, NY, pp. 3.14-3.37).

Charging of the adapted vector with topoisomerase requires the addition of
annealing oligonucleotides to generate double stranded DNA on TOPO H's and TOPO 1's
single stranded overhangs. Charging of the adapted vector takes place in the absence of
DNA ligase to prevent the formation of phosphodiester bonds between the adapted vector
and the annealing oligo, since phosphodiester bonds in the non-scissile strand will prevent
the dissociation of the leaving group upon cleavage (see Figure 9).

The annealing oligonucleotide (TOPO 17) must have complementation to the
single stranded DNA overhang of TOPO H. In the current embodiment the overhang has
the following sequence, 5'-CGATAGTG-3'. Therefore, TOPO 17 has the following
sequence, 3'-GCTATCAC-5', which comprises full complementation to the single
stranded overhang of the adapter oligonucleotides.

The annealing oligonucleotide (TOPO 3) must have complementation to the
single stranded DNA overhang of TOPO 1. In the current embodiment the overhang has
the following sequence, 3'-GTGATAGCCTTA-5' (SEQ ID NO:14). Therefore, TOPO 3
has the following sequence, 5'-CAACACTATCGGAAT-3' (SEQ ID NO:15), which
comprises full complementation to the adapter oligonucleotide's single stranded overhang
and an additional 3 bp overhang, 5'-CAA-3'.

Incubation of the adapted vector with the annealing oligo in the presence of
topoisomerase will create double stranded DNA to which topoisomerase can
non-covalently bind (FIG. 14). Bound topoisomerase will search the double stranded DNA by
a facilitated diffusion mechanism, until the 5'-CCCTT recognition motif is located.
Cleavage of the phosphodiester backbone of the scissile strand 3' of the motif will result
in the covalent attachment of the DNA to the enzyme by a 3'-phosphotyrosyl linkage
(Shuman et al., Proc. Natl. Acad. Sci. U.S.A. 86:9793-9796, 1989). Cleavage of the
scissile strand creates a double stranded leaving group comprising the 3' end of the adapter
oligos, downstream from the 5'-CCCTT motif, and the complementary annealing
oligonucleotide. The leaving group can religate to the topoisomerase adapted vector
through its 5' hydroxyl's attack of the phosphotyrosyl linkage, also catalyzed by
topoisomerase. Addition of T4 polynucleotide kinase to the equilibrium reaction prevents
the back reaction via the kinase-mediated phosphorylation of the leaving group's
5' hydroxyl (Ausubel et al., (1992) Second Edition; Short Protocols in Molecular
Biology, John Wiley & Sons, Inc., New York, NY, pp. 3.14-3.30). The resulting
linearized vector comprises a blunt end from the TOPO 1/3 leaving group and a single
stranded sequence end from the TOPO H/17 leaving group (Figure 15). Both of the
linearized cloning vector's ends are charged with topoisomerase, enabling fast, efficient
and directional topoisomerase mediated insertion of an acceptor molecule.

Directional cloning according to the invention. 

This invention also provides a method for directional cloning of DNA. In the
following example, the topoisomerase-charged ds nucleic acid vector according to the
present invention constructed from pUni/V5-His version A was used for the directional
insertion of ORFs from the GeneStorm™ Expression Ready Clones (Invitrogen Corp.,
Carlsbad, CA). The modified pUni vector was selected for the cloning of these ORFs
because the added target sequence, which becomes a single strand overhang upon
topoisomerase cleavage of the vector, has homology to the Kozak sequence known to
enhance ORF expression. Note, however, that, as before, any plasmid, cosmid, virus or
other DNA could be modified to possess the necessary single stranded sequence.
Likewise, any DNA fragment could be modified to possess a homologous sequence to
any single stranded overhang of a vector. As a point of interest, the sequence of the
single stranded overhang can affect directional cloning efficiencies. For example, single
stranded overhangs with low GC content will have lower annealing stability. Also, single
stranded overhangs that have high complementarity to both ends of a DNA fragment to
be cloned will lose the capability to direct these DNA inserts. Thus the sequence of a
single stranded overhang should be carefully designed to avoid these and similar
problems.
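
The two design concerns just mentioned (GC content and complementarity to both ends of the fragment) can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names and the example insert sequences are hypothetical, the overhang is written 5'->3', and only exact matches are counted, whereas real annealing tolerates partial matches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def ends_matching_overhang(insert_top, overhang_5to3):
    """Count how many 5' termini of a blunt insert can pair with the overhang.

    0 = no end pairs, 1 = a single end pairs (directional),
    2 = both ends pair (orientation can no longer be directed).
    """
    target = reverse_complement(overhang_5to3)  # sequence that pairs with the overhang
    n = len(target)
    top = insert_top.upper()
    bottom = reverse_complement(top)  # other strand, read 5'->3'
    return int(top[:n] == target) + int(bottom[:n] == target)


if __name__ == "__main__":
    print(gc_fraction("GGTG"))                                 # overhang GC content
    print(ends_matching_overhang("CACCATGGCTAGCTAA", "GGTG"))  # hypothetical insert
    print(ends_matching_overhang("CACCATGGCTAGGGTG", "GGTG"))  # hypothetical insert
```

A count of 2 flags an insert whose far end also carries the complementary sequence, which is the situation the text warns against.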

EXAMPLE 3 

The present invention is particularly useful in the directional insertion of PCR
products into vectors constructed according to the present invention. In the PCR
amplification of the desired insert, the PCR primers are designed so as to complement
identified sequences of the insert(s) that are to be directionally cloned into the
topoisomerase-charged ds nucleic acid vector of the present invention. The primer
designed to bind upstream of the DNA's coding strand is modified with an additional
complementary nucleotide sequence on its 5' end. The resulting PCR product will
possess a complementary sequence allowing single stranded overhang mediated
directional insertion into the topoisomerase-charged ds nucleic acid cloning vector of the
present invention and subsequent expression of the product.

One embodiment comprises introducing to a donor duplex DNA substrate a single
stranded overhang site by PCR amplifying the donor duplex DNA molecule with the
5' oligonucleotide primer containing the single stranded overhang. PCR amplification of
a region of DNA is achieved by designing oligonucleotide primers that complement a
known area outside of the desired region. In a preferred embodiment the primer that has
homology to the coding strand of the double stranded region of DNA will possess an
additional sequence of nucleotides complementary to the single strand overhang of the
topoisomerase-charged ds nucleic acid cloning vector of the present invention.

Using the present invention in a high throughput format, we selected eighty-two
known ORFs from the GeneStorm™ expression system (Invitrogen Corporation,
Carlsbad, CA) for directional cloning into the topoisomerase-charged ds nucleic acid
vector of the present invention; however, any sequence of DNA can be selected as desired
by individual users. For each of these ORFs, primers are designed with homology to the
coding and the non-coding strands. To clone PCR products in a directional fashion into
the modified pUni/V5-His version A topoisomerase-charged ds nucleic acid vector of the
present invention as described in Example 1, one primer of a given pair was modified to
contain the nucleotide sequence
complementary to the single strand overhang contained within the vector. In the current
example, the coding primer contained the added sequence 5'-CACC-3', which
complements the single stranded overhang, 3'-GTGG-5', of the topoisomerase-charged
ds nucleic acid cloning vector of the present invention. PCR amplification of the above
ORFs with their respective primers will produce double stranded DNA fragments, which
possess the single strand overhang at their 5' end (Figure 16). We used pfu polymerase in
our PCR amplification, but it is well-known that PCR reactions can be performed with
either a non-thermophilic polymerase such as pfu or with a thermophilic polymerase like
Taq followed by a blunting step to remove the non-template nucleotide these enzymes
leave at the end of PCR products.
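
The primer modification described above amounts to prefixing the coding-strand primer with the complement of the vector overhang. The sketch below shows that step in Python; the function names and the example primer are hypothetical, and the overhang is passed as written in the text (3'->5', GTGG) so that its base-by-base complement reads 5'->3'.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tag_for_overhang(overhang_3to5):
    """Base-by-base complement of an overhang written 3'->5'.

    Because the input runs 3'->5', the result already reads 5'->3'.
    """
    return overhang_3to5.upper().translate(_COMPLEMENT)


def add_directional_tag(forward_primer, overhang_3to5="GTGG"):
    """Prefix the coding-strand primer (5'->3') with the overhang complement."""
    return tag_for_overhang(overhang_3to5) + forward_primer.upper()


if __name__ == "__main__":
    # Hypothetical gene-specific primer beginning at an ATG start codon.
    print(add_directional_tag("ATGGCTAGCAAAGGAGAAG"))
```

Only the coding-strand primer is tagged; the non-coding primer is left unchanged, so a single end of the PCR product can pair with the overhang.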
In the present example, 0.1 microgram of each primer was combined with
0.05 microgram of DNA containing an ORF in a PCR reaction mix totaling 50 microliters.
Besides the primers and vector, the reaction mix also contained water, PCR
buffer salts, 10 mM dNTPs and 1.25 units of pfu polymerase. Thermal cycling
temperatures were as follows: an initial 94°C denaturation; followed by 25 repetitions of
94°C denaturation, 55°C primer annealing, and 72°C elongation, each at one minute; and
ended with a 72°C, fifteen minute elongation. However, these parameters will vary with
each DNA fragment to be amplified. PCR amplification techniques are well known to
those skilled in the art (Ausubel et al., (1992) Second Edition; Short Protocols in
Molecular Biology, John Wiley & Sons, Inc., New York, NY, pp. 15.3-15.4), as are
techniques for the conversion of 3' overhangs to blunt end termini (Protocols and
Applications Guide, Promega Corp., Madison, WI, pp. 43-44, 1989).

Incubation of the PCR amplified donor duplex DNA containing the
complementary nucleotide sequence with the modified pUni/V5-His version A
topoisomerase-charged ds nucleic acid vector of the present invention results in the
directional cloning of the donor DNA. For example, the eighty-two ORFs from the
GeneStorm™ clone collection (Invitrogen Corporation, Carlsbad, CA) were amplified
using adapted primers containing a complementary nucleotide sequence. Amplification
of the 82 GeneStorm™ ORFs with the described modified primer pairs resulted in PCR
products that had the complementary nucleotide sequence at their 5' end. This ORF PCR
product is combined with 10 ng of topoisomerase-charged ds nucleic acid cloning vector
of the present invention in either sterile water or a salt solution. The reaction is mixed
gently and incubated for 5 minutes at room temperature (22-23°C). After five minutes,
we placed the reaction on ice then proceeded to the One Shot® Chemical Transformation
or Electroporation (Invitrogen Corporation, Carlsbad, CA, Catalogue # C4040-10 and
C4040-50, respectively) (Invitrogen TOPO Cloning Protocol, Invitrogen Corp.).
Topoisomerase had joined the adjacent strands of the vector and the product by catalyzing
a rejoining reaction (Figure 17). DNA fragments constructed with the complementary
nucleotide sequence at their 5' ends were thus correctly inserted into topoisomerase-
charged ds nucleic acid cloning vectors of the present invention with a high efficiency.

Directional insertion of DNA fragments containing complementary 5' sequences
into ds nucleic acid cloning vectors according to certain embodiments of the present
invention occurs with greater than 90% efficiency as shown by sequencing multiple
colonies of transformed host cells. In the current example, the topoisomerase-charged ds
nucleic acid cloning vectors of the present invention containing the GeneStorm™ ORFs
were incubated with transformation competent E. coli host cells. In seventy-four of the
transformation reactions, the directional cloning of the ORFs into the topoisomerase-
charged ds nucleic acid cloning vector of the present invention occurred in at least seven
of the eight colonies picked, and fifty-nine of these cloning reactions were directional in
all eight colonies picked. The overall directional cloning score was 609 of 656; thus,
directional insertion was present in over 93% of the clones picked (see Table 1 below).

EXAMPLE 4 

In a similar example, using the above described modified pCR®2.1
topoisomerase-charged ds nucleic acid vector of the present invention, a PCR-generated
ORF encoding Green Fluorescent Protein (GFP) was directionally cloned in frame with
the lacZ α fragment present in the vector (see FIG. 4). The primers used to amplify the
GFP gene contained the requisite complementary nucleotide sequence 5'-ATTC-3', and
the known sequence for translation initiating methionine, 5'-ATG-3'. Using the necessary
cloning steps noted above, the PCR amplified GFP was inserted into the vector and
transformed cells were grown on solid agar plates. Glowing colonies represented a
correctly inserted PCR product (see Table 2 below).

These data represent a substantial improvement over the current state of the art in
cloning, and furthermore present an invention in cloning that is highly compatible with
high throughput techniques. Given directional cloning efficiencies greater than 90%, a
user need only screen two colonies for each cloned DNA fragment. Thus, on a 96-well
plate, forty-eight separate clones can be screened for directional insertion, 400% more
than current cloning techniques. Use of this invention will streamline many high
throughput gene expression operations, and allow them to run at a fraction of their current
costs.
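
The screening argument above can be made concrete with a short calculation. The sketch below is an illustration only: it assumes that each picked colony is independently directional with probability p (an assumption not stated in the text), and the function names are hypothetical.

```python
def prob_at_least_one_directional(p, n):
    """Probability that at least one of n picked colonies is directional,
    assuming each colony is independently directional with probability p."""
    return 1.0 - (1.0 - p) ** n


def clones_per_plate(wells=96, colonies_per_clone=2):
    """Number of distinct clones that fit on one plate."""
    return wells // colonies_per_clone


if __name__ == "__main__":
    print(prob_at_least_one_directional(0.90, 2))
    print(clones_per_plate(96, 2))
    print(clones_per_plate(96, 8))
```

With p = 0.90 and two colonies per clone, the chance of finding at least one directional insert is about 0.99, and a 96-well plate holds 48 clones. Screening eight colonies per clone, as was done for the reactions in Example 3, would fit 12 clones on the same plate.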



Table 1. Directional Cloning of ORFs using a topoisomerase-charged ds nucleic acid
cloning vector of the present invention

Positive colonies / clones tested     PCR reactions
8/8                                   59
7/8                                   15
6/8                                   2
5/8                                   1
4/8                                   3
3/8                                   2
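
The rows of Table 1 can be tallied directly, as in the following Python sketch. The dictionary simply transcribes the table (positive colonies out of eight mapped to the number of PCR reactions); the variable names are illustrative. Note that the rows as printed sum to 612 positive colonies, slightly more than the 609 quoted in the text; the sketch reports the row totals and does not attempt to reconcile the two figures.

```python
# Positive colonies out of 8 picked -> number of PCR reactions (from Table 1).
table1 = {8: 59, 7: 15, 6: 2, 5: 1, 4: 3, 3: 2}

COLONIES_PER_REACTION = 8

reactions = sum(table1.values())
colonies_tested = reactions * COLONIES_PER_REACTION
positive_colonies = sum(k * count for k, count in table1.items())
reactions_at_least_7 = sum(count for k, count in table1.items() if k >= 7)

if __name__ == "__main__":
    print(reactions, colonies_tested, positive_colonies, reactions_at_least_7)
    print(positive_colonies / colonies_tested)
```

The tally reproduces the eighty-two reactions, the 656 colonies tested, and the seventy-four reactions with at least seven of eight colonies directional.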

Table 2. In frame and directional insertion of GFP into modified pCR2.1
topoisomerase-charged ds nucleic acid cloning vector of the present invention

PCR product's 5' sequence        Percentage of       Total white colonies
                                 correct inserts     (contain a recombinant plasmid)
5'-ATTCATG-3' homologous         86%                 457
5'-CAAGATG-3' non-homologous     35%                 118
5'-ATTCGGATG-3' frame shift      0%                  268
VECTOR ONLY                      0%                  31



Although the invention has been described with reference to the above examples, it
will be understood that modifications and variations are encompassed within the spirit and
scope of the invention. Accordingly, the invention is limited only by the following claims.

What is claimed is:

1. A method for generating a directionally linked recombinant nucleic acid
molecule, the method comprising contacting:

a) a topoisomerase-charged first double stranded (ds) nucleic acid
molecule, comprising a first topoisomerase covalently bound at or near a first end,
and a second topoisomerase covalently bound at or near a second end, said first
end further comprising a first 5' overhang, and said second end further comprising
a blunt end, a 3' thymidine overhang, or a second 5' overhang; and

b) a second ds nucleic acid molecule, comprising a first blunt end and a
second end, wherein the first blunt end comprises at its 5' terminus, a nucleotide
sequence complementary to the first 5' overhang,

under conditions such that the nucleotide sequence complementary to the first
5' overhang can selectively hybridize to the first 5' overhang,

whereby the first topoisomerase can covalently link the 3' terminus of the first end
of the first ds nucleic acid molecule with the 5' terminus of the first end of the second ds
nucleic acid molecule, and

whereby the second topoisomerase can covalently link the 3' terminus of the
second end of the first nucleic acid molecule to the 5' terminus of the second end of the
second ds nucleic acid molecule, thereby generating a directionally linked nucleic acid
molecule.

2. The method of claim 1, wherein the second end of the first ds nucleic acid
molecule comprises a blunt end, and the second end of the second ds nucleic acid
molecule comprises a blunt end.

3. The method of claim 1, wherein the second end of the topoisomerase-charged
first ds nucleic acid molecule comprises a 3' thymidine overhang, and the second end of
the second ds nucleic acid molecule comprises a 3' adenosine overhang.



WO 02/16594    PCT/US01/26294

4. The method of claim 1, wherein the topoisomerase-charged, first ds nucleic
acid molecule comprises a second 5' overhang at the second end, and the second
ds nucleic acid comprising at the second end, a nucleotide sequence complementary to the
second 5' overhang.

5. The method of claim 1, wherein the first ds nucleic acid molecule is a vector.

6. The method of claim 5, wherein the topoisomerase-charged first ds nucleic acid
molecule is a cloning vector.

7. The method of claim 6, wherein the topoisomerase-charged first ds nucleic acid
molecule is an expression vector.

8. The method of claim 1, further comprising introducing the directionally-linked
recombinant nucleic acid molecule into a cell.

9. The method of claim 8, wherein the cell is a eukaryotic cell.

10. The method of claim 9, wherein the cell is a mammalian cell.

11. A cell produced by the methods of claim 8.

12. A transgenic non-human organism generated from the cell of claim 11.

13. The method of claim 8, wherein the cell is a bacterium.

14. The method of claim 1, wherein the second ds nucleic acid molecule
comprises an amplification product.

15. The method of claim 1, wherein the topoisomerase is a type IB
topoisomerase.

16. The method of claim 1, wherein the second ds nucleic acid molecule
comprises one of a plurality of second ds nucleic acid molecules.

17. The method of claim 16, wherein second ds nucleic acid molecules in the
plurality are different from each other.

18. The method of claim 16, wherein said plurality of second ds nucleotide
molecules comprises a cDNA library.

19. The method of claim 16, wherein said plurality of second ds nucleotide
molecules comprises a combinatorial library.

20. A recombinant nucleic acid molecule produced by the method of claim 1.

21. A method for generating a directionally linked recombinant nucleic acid
molecule, the method comprising contacting:

a) a first precursor double stranded (ds) nucleic acid molecule comprising
a first end, which comprises at the 5' terminus, a first 5' target sequence,
and at the 3' terminus, a topoisomerase recognition site; and
a second end which comprises at the 3' terminus, a topoisomerase
recognition site;

b) a second ds nucleic acid molecule comprising a first blunt end and a
second end, wherein the first blunt end comprises at the 5' terminus a nucleotide
sequence complementary to the 5' target sequence; and

c) a topoisomerase specific for the topoisomerase recognition site,

under conditions that allow topoisomerase activity, and that allow hybridization of
the first 5' target sequence and the nucleotide sequence complementary to the target
sequence, thereby generating a directionally linked recombinant nucleic acid molecule.



22. The method of claim 21, wherein the second end of the first precursor ds
nucleic acid becomes a blunt end upon cleavage by the topoisomerase, and the second
end of the second ds nucleic acid molecule is a blunt end.

23. The method of claim 21, wherein the second end of the first precursor ds
nucleic acid molecule comprises a 3' thymidine extension upon cleavage by the
topoisomerase, and the second end of the second ds nucleic acid molecule comprises a
3' adenosine overhang.

24. The method of claim 21, wherein the first precursor ds nucleic acid molecule
comprises a second 5' target sequence located at the second end and the second ds nucleic
acid molecule comprises at the second end, a nucleic acid sequence complementary to the
second 5' target sequence.

25. The method of claim 24, wherein the first precursor ds nucleic acid molecule
is a vector.

26. The method of claim 25, wherein the vector is an expression vector.

27. The method of claim 21, further comprising introducing the directionally- 
linked recombinant nucleic acid molecule into a cell. 

28. The method of claim 21, wherein the first precursor ds nucleic acid molecule
comprises an expression control element and the second ds nucleic acid molecule
comprises an open reading frame, wherein in the directionally linked recombinant nucleic
acid molecule, the expression control element is operatively linked to the open reading
frame.

29. The method of claim 21, wherein the second ds nucleic acid molecule
comprises an amplification product.

30. The method of claim 21, wherein the topoisomerase is a type IB
topoisomerase.

31. The method of claim 21, wherein the second ds nucleic acid molecule
comprises one of a plurality of second ds nucleic acid molecules.

32. The method of claim 31, wherein said plurality of second ds nucleotide
molecules comprises a cDNA library.

33. A recombinant nucleic acid molecule produced by the method of claim 21.

34. A method for generating a directionally linked recombinant nucleic acid
molecule, the method comprising contacting:

a) a topoisomerase-charged first double stranded (ds) nucleic acid
molecule, comprising a first topoisomerase covalently bound to the 3' terminus of
a first end of the ds nucleic acid molecule, said first end further comprising a first
5' overhang; and

b) a second ds nucleic acid molecule, comprising a first blunt end and a
second end, wherein the first blunt end comprises a 5' nucleotide sequence
complementary to the first 5' overhang,

under conditions such that the 5' nucleotide sequence of the first blunt end can
selectively hybridize to the first 5' overhang,

whereby the first topoisomerase can covalently link the 3' terminus of the first end
of the first ds nucleic acid molecule with the 5' terminus of the first end of the second ds
nucleic acid molecule.

35. The method of claim 34, further comprising contacting the topoisomerase-
charged first ds nucleic acid molecule and the second ds nucleic acid molecule with a
third ds nucleic acid molecule,

wherein a first end of the third ds nucleic acid molecule comprises a 5' overhang
and a second topoisomerase covalently bound at the 3' terminus, and

wherein the second ds nucleic acid molecule further comprises a second blunt end,
which comprises a 5' nucleotide sequence complementary to the second 5' overhang, and

wherein the contacting is performed under conditions such that the 5' nucleotide
sequence of the second blunt end of the second ds nucleic acid can selectively hybridize
to the 5' overhang of the first end of the third ds nucleic acid molecule,

whereby the second topoisomerase can covalently link the 3' terminus of the first
end of the third ds nucleic acid molecule with the 5' terminus of the second blunt end of
the second ds nucleic acid molecule.

36. The method of claim 35, wherein the first ds nucleic acid molecule is
directionally linked to the second ds nucleic acid molecule and, thereafter, the third ds
nucleic acid molecule is directionally linked to the second ds nucleic acid molecule.

37. The method of claim 34, wherein the first ds nucleic acid molecule is
operatively linked to the second ds nucleic acid molecule.

38. The method of claim 34, wherein the first ds nucleic acid comprises an
expression control element and the second ds nucleic acid comprises an open reading
frame.

39. The method of claim 35, wherein the first ds nucleic acid molecule comprises
an expression control element, the second ds nucleic acid molecule comprises an open
reading frame, and the third ds nucleic acid molecule encodes a peptide,

wherein, in the directionally linked recombinant nucleic acid molecule, the
expression control element is operatively linked to the open reading frame, and the
second ds nucleic acid molecule is operatively linked to the third ds nucleic acid
molecule, and

wherein the second ds nucleic acid molecule operatively linked to the third ds
nucleic acid molecule encodes a fusion protein comprising the open reading frame and the
peptide.

40. The method of claim 39, wherein the peptide comprises a tag.

41. An isolated double stranded (ds) nucleic acid molecule, comprising a first
topoisomerase covalently bound at a 3' terminus of a first end, and a second
topoisomerase covalently bound at a 3' terminus of a second end, said first end further
comprising a first 5' overhang and said second end further comprising a blunt end, a
3' thymidine overhang, or a second 5' overhang, wherein said first 5' overhang is different
from said second 5' overhang.

42. The ds nucleic acid molecule of claim 41, wherein the second end comprises
a blunt end.

43. The ds nucleic acid molecule of claim 41, wherein the second end comprises
a single 3' thymidine overhang.

44. The ds nucleic acid molecule of claim 41, wherein the second end comprises
a second 5' overhang.

45. The ds nucleic acid molecule of claim 41, wherein the first 5' overhang
comprises the nucleotide sequence 5'-GGTG-3'.

46. The ds nucleic acid molecule of claim 41, wherein the ds nucleic acid
molecule is a vector.

47. The ds nucleic acid molecule of claim 46, wherein the vector further
comprises a recombinase site.

48. The ds nucleic acid molecule of claim 47, wherein the vector further
comprises a lox site.

49. The ds nucleic acid molecule of claim 46, wherein the ds nucleic acid
molecule is a cloning vector.

50. The ds nucleic acid molecule of claim 49, wherein the ds nucleic acid
molecule is an expression vector.

51. The ds nucleic acid molecule of claim 48, wherein the first end and the
second end are adjacent sequences of a nucleotide sequence encoding a selectable marker.

52. The ds nucleic acid molecule of claim 46, wherein the vector is a
pUni/V5-His version A vector (SEQ ID NO:16).

53. The ds nucleic acid molecule of claim 52, wherein the 5' overhang of the first
end comprises 5'-GGTG-3', and wherein the second end is a blunt end, which comprises a
topoisomerase recognition site at the 3' terminus.

54. The ds nucleic acid molecule of claim 46, wherein the vector is a pCR®2.1
vector.

55. The ds nucleic acid molecule of claim 54, wherein the 5' overhang of the first
end comprises 5'-GAAT-3', and wherein the second end is a blunt end, which comprises a
topoisomerase recognition site at the 3' terminus.

56. A composition, comprising:

a) a first ds nucleic acid molecule comprising a first end and a second end,
wherein the first end comprises a 5' overhang and a topoisomerase covalently
bound at the 3' terminus, and

b) a second ds nucleic acid molecule comprising a first blunt end and a
second end, wherein the first blunt end comprises a first 5' nucleotide sequence,
which is complementary to the first 5' overhang, and a first 3' nucleotide sequence
complementary to the first 5' nucleotide sequence.

57. The composition of claim 56, wherein the first 5' nucleotide sequence of the
first blunt end of the second ds nucleic acid molecule is hybridized to the first 5' overhang
of the first end of the first nucleic acid molecule, and the first 3' nucleotide sequence of
the first blunt end of the second ds nucleic acid molecule is displaced.

58. The composition of claim 56, wherein the first ds nucleic acid molecule
further comprises a second 5' overhang at the second end,

wherein the second end of the second ds nucleic acid molecule further comprises a
second 5' nucleotide sequence, which is complementary to the second 5' overhang, and a
second 3' nucleotide sequence complementary to the second 5' nucleotide sequence.

59. A kit containing the ds nucleic acid molecule of claim 1.

60. The kit of claim 59, wherein the nucleic acid molecule is a vector.

61. The kit of claim 59, further comprising an expression control element.
62. A kit, comprising

a) a first double stranded (ds) nucleic acid molecule, which comprises a
first topoisomerase covalently bound at a 3' terminus of a first end, and a second
topoisomerase covalently bound at a 3' terminus of a second end,

said first end further comprising a first 5' overhang and said second end
further comprising a blunt end, a 3' thymidine overhang, or a second 5' overhang,
wherein said first 5' overhang is different from said second 5' overhang; and

b) a plurality of second ds nucleic acid molecules, wherein each ds nucleic
acid molecule in the plurality comprises a first blunt end, and wherein the first
blunt end comprises a 5' nucleotide sequence complementary to the first
5' overhang of the first ds nucleic acid molecule.

63. The kit of claim 62, wherein the second ds nucleic acid molecules in the
plurality comprise transcriptional regulatory elements, translational regulatory elements,
or a combination thereof.

64. The kit of claim 62, wherein the second ds nucleic acid molecules in the
plurality comprise nucleotide sequences encoding a peptide.
[FIGURE 3: Multiple cloning site of pUni/V5-His A, B, C (2.3 kb). Annotated features include the forward priming site, loxP site, RBS, restriction sites (AgeI, BamHI, ApaI, AatII, StuI, PvuI, SacI), V5 epitope, 6xHis tag, and reverse priming site.]
[FIGURE 4: Multiple cloning site. Annotated features include the lacZα ATG, M13 Reverse Primer site, restriction sites (including SacI, BstXI, EcoRI, EcoRV, NotI, XhoI, NsiI, XbaI, ApaI, AvaI), T7 Promoter, M13 Forward (-20) Primer site, and M13 Forward (-40) Primer site.]
[Figure: Sequence of pUni/V5-His version A.]
[Figure: Add EcoRI and SacI digestion enzymes. Panels show the resulting cohesive end post EcoRI digest (5'-G over 3'-CTTAA) and the resulting cohesive end post SacI digest.]
5'-XXXXXXXXXXXXXXXXXCCCTT XXXXXXXX-3'
3'-xxxxxxxxxxxxxxxxxGGGAA xxxxxxxx-5'

ADD TOPOISOMERASE

                   (topo)     Leaving group
5'-XXXXXXXXXXXXXXXXXCCCTT  XXXXXXXX-3'
3'-xxxxxxxxxxxxxxxxxGGGAA xxxxxxxx-5'
5'-XXXXXXXXXXXXXXXXXCCCTT XXXXXXXX-3'
3'-xxxxxxxxxxxxxxxxxGGGAA xxxxxxxx-5'

ADD TOPOISOMERASE

                   (topo)
5'-XXXXXXXXXXXXXXXXXCCCTT
3'-xxxxxxxxxxxxxxxxxGGGAA

Dissociated leaving group:
XXXXXXXX-3'
xxxxxxxx-5'
[Figure: Sequence of pCR 2.1.]
[Figure: Map of the linearized, TOPO charged, flap vector (modified pCR 2.1). Labeled features: TOPO bound at each end (CCCTT over GGGAATAAG on one end, AAGGG on the other), M13 reverse and forward priming sites, T7 promoter, and ColE1 ORI.]
SEQUENCE LISTING

<110> INVITROGEN CORPORATION
CHESTNUT, John
SHUMAN, Stewart
HEYMAN, John
MADDEN, Knut
BENNETT, Rob

<120> METHODS AND REAGENTS FOR MOLECULAR CLONING

<130> INVIT1300WO

<150> US 60/226,563
<151> 2000-08-21

<160> 18

<170> PatentIn version 3.0

<210> 1
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Vaccinia topoisomerase cleavable sequence

<400> 1
gcccttattc cc 12

<210> 2
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Vaccinia topoisomerase cleavable sequence

<400> 2
tcgcccttat tc 12

<210> 3
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Vaccinia topoisomerase cleavable sequence

<400> 3
tgtcgccctt at 12

<210> 4
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Vaccinia topoisomerase cleavable sequence

<400> 4
gtgtcgccct ta 12



<210> 5
<211> 28
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 5
aattgatccc ttcaccgaca tagtacag 28

<210> 6
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 6
ggtgaaggga tc 12

<210> 7
<211> 11
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 7
aagggcgagc t 11

<210> 8
<211> 19
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 8
cgcccttgac atagtacag 19

<210> 9
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Intermediate vector overhang sequence

<400> 9
gacatagtac ag 12



<210> 10 

<211> 15 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Annealing oligonucleotide 

<400> 10 

caactgtact atgtc 15 



<210> 11
<211> 23
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 11
agctcgccct tattccgata gtg 23

<210> 12
<211> 11
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 12
gaataagggc g 11

<210> 13
<211> 23
<212> DNA
<213> Artificial sequence

<220>
<223> Adapter oligonucleotide

<400> 13
aattcgccct tattccgata gtg 23

<210> 14
<211> 12
<212> DNA
<213> Artificial sequence

<220>
<223> Intermediate vector overhang sequence

<400> 14
attccgatag tg 12



<210> 15 

<211> 15 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Annealing oligonucleotide 

<400> 15 

caacactatc ggaat 15 



<210> 16
<211> 2290
<212> DNA
<213> Artificial sequence

<220>
<223> pUni/V5-His version A vector

<400> 16
aattcccatg tcagccgtta agtgttcctg tgtcactcaa aattgctttg agaggctcta 60
agggcttctc agtgcgttac atccctggct tgttgtccac aaccgttaaa ccttaaaagc 120
tttaaaagcc ttatatattc ttttttttct tataaaactt aaaaccttag aggctattta 180
agttgctgat ttatattaat tttattgttc aaacatgaga gcttagtacg tgaaacatga 240
gagcttagta cgttagccat gagagcttag tacgttagcc atgagggttt agttcgttaa 300
acatgagagc ttagtacgtt aaacatgaga gcttagtacg tgaaacatga gagcttagta 360
cgtactatca acaggttgaa ctgctgatca acagatcctc tacgcggccg cggtaccata 420
acttcgtata gcatacatta tacgaagtta tcggaggaat tggctcgagg aattcaccgg 480
tgccgtgtgg gcggatccgg gcccgacgtc aggcctcgat cggagctcgg taagcctatc 540
cctaaccctc tcctcggtct cgattctagc catcatcacc atcaccattg aagctcgcta 600
tcagcctcga ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgcct 660
tccttgaccc tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca 720
tcgcattgtc tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaag 780
ggggaggatt gggaagacaa tagcaggcat gctggggatt ctagaagatc cggctgctaa 840
caaagcccga aaggaagctg agttggctgc tgccaccgct gagcaataac tagcataacc 900
ccttggggcc tctaaacggg tcttgagggg ttttttgctg aaaggaggaa ctatatccgg 960
atatcccggg gtgggcgaag aactccagca tgagatcccc gcgctggagg atcatccagc 1020
cggcgtcccg gaaaacgatt ccgaagccca acctttcata gaaggcggcg gtggaatcga 1080
aatctcgtga tggcaggttg ggcgtcgctt ggtcggtcat ttcgaacccc agagtcccgc 1140
tcagaagaac tcgtcaagaa ggcgatagaa ggcgatgcgc tgcgaatcgg gagcggcgat 1200
accgtaaagc acgaggaagc ggtcagccca ttcgccgcca agctcttcag caatatcacg 1260
ggtagccaac gctatgtcct gatagcggtc cgccacaccc agccggccac agtcgatgaa 1320
tccagaaaag cggccatttt ccaccatgat attcggcaag caggcatcgc catgtgtcac 1380
gacgagatcc tcgccgtcgg gcatgcgcgc cttgagcctg gcgaacagtt cggctggcgc 1440
gagcccctga tgctcttcgt ccagatcatc ctgatcgaca agaccggctt ccatccgagt 1500
acgtgctcgc tcgatgcgat gtttcgcttg gtggtcgaat gggcaggtag ccggatcaag 1560
cgtatgcagc cgccgcattg catcagccat gatggatact ttctcggcag gagcaaggtg 1620
agatgacagg agatcctgcc ccggcacttc gcccaatagc agccagtccc ttcccgcttc 1680
agtgacaacg tcgagcacag ctgcgcaagg aacgcccgtc gtggccagcc acgatagccg 1740
cgctgcctcg tcctgcagtt cattcagggc accggacagg tcggtcttga caaaaagaac 1800
cgggcgcccc tgcgctgaca gccggaacac ggcggcatca gagcagccga ttgtctgttg 1860
tgcccagtca tagccgaata gcctctccac ccaagcggcc ggagaacctg cgtgcaatcc 1920
atcttgttca atcatgcgaa acgatcctca tcctgtctct tgatcagatc ttgatcccct 1980
gcgccatcag atccttggcg gcaagaaagc catccagttt actttgcagg gcttcccaac 2040
cttaccagag ggcgccccag ctggcaattc cggttcgctt gctgtccata aaaccgccca 2100
gtctagctat cgccatgtaa gcccactgca agctacctgc tttctctttg cgcttgcgtt 2160
ttcccttgtc cagatagccc agtagctgac attcatccgg ggtcagcacc gtttctgcgg 2220
actggctttc tacgtgttcc gcttccttta gcagcccttg cgccctgagt gcttgcggca 2280
gcgtgaagct 2290




<210> 17
<211> 3906
<212> DNA
<213> Artificial sequence

<220>
<223> pCR2.1 vector

<400> 17
agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc 60
acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc 120
tcactcatta ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa 180
ttgtgagcgg ataacaattt cacacaggaa acagctatga ccatgattac gccaagcttg 240
gtaccgagct cggatccact agtaacggcc gccagtgtgc tggaattcgg cttaagccga 300
attctgcaga tatccatcac actggcggcc gctcgagcat gcatctagag ggcccaattc 360
gccctatagt gagtcgtatt acaattcact ggccgtcgtt ttacaacgtc gtgactggga 420
aaaccctggc gttacccaac ttaatcgcct tgcagcacat ccccctttcg ccagctggcg 480
taatagcgaa gaggcccgca ccgatcgccc ttcccaacag ttgcgcagcc tgaatggcga 540
atgggacgcg ccctgtagcg gcgcattaag cgcggcgggt gtggtggtta cgcgcagcgt 600
gaccgctaca cttgccagcg ccctagcgcc cgctcctttc gctttcttcc cttcctttct 660
cgccacgttc gccggctttc cccgtcaagc tctaaatcgg gggctccctt tagggttccg 720
atttagagct ttacggcacc tcgaccgcaa aaaacttgat ttgggtgatg gttcacgtag 780
tgggccatcg ccctgataga cggtttttcg ccctttgacg ttggagtcca cgttctttaa 840
tagtggactc ttgttccaaa ctggaacaac actcaaccct atcgcggtct attcttttga 900
tttataaggg attttgccga tttcggccta ttggttaaaa aatgagctga tttaacaaat 960
tcagggcgca agggctgcta aaggaaccgg aacacgtaga aagccagtcc gcagaaacgg 1020
tgctgacccc ggatgaatgt cagctactgg gctatctgga caagggaaaa cgcaagcgca 1080
aagagaaagc aggtagcttg cagtgggctt acatggcgat agctagactg ggcggtttta 1140
tggacagcaa gcgaaccgga attgccagct ggggcgccct ctggtaaggt tgggaagccc 1200
tgcaaagtaa actggatggc tttcttgccg ccaaggatct gatggcgcag gggatcaaga 1260
tctgatcaag agacaggatg aggatcgttt cgcatgattg aacaagatgg attgcacgca 1320
ggttctccgg ccgcttgggt ggagaggcta ttcggctatg actgggcaca acagacaatc 1380
ggctgctctg atgccgccgt gttccggctg tcagcacagg ggcgcccggt tctttttgtc 1440
aagaccgacc tgtccggtgc cctgaatgaa ctgcaggacg aggcagcgcg gctatcgtgg 1500
ctggccacga cgggcgttcc ttgcgcagct gtgctcgacg ttgtcactga agcgggaagg 1560
gactggctgc tattgggcga agtgccgggg caggatctcc tgtcatctcg ccttgctcct 1620
gccgagaaag tatccatcat ggctgatgca atgcggcggc tgcatacgct tgatccggct 1680
acctgcccat tcgaccacca agcgaaacat cgcatcgagc gagcacgtac tcggatggaa 1740
gccggtcttg tcgatcagga tgatctggac gaagagcatc aggggctcgc gccagccgaa 1800
ctgttcgcca ggctcaaggc gcgcatgccc gacggcgagg atctcgtcgt gatccatggc 1860
gatgcctgct tgccgaatat catggtggaa aatggccgct tttctggatt caacgactgt 1920
ggccggctgg gtgtggcgga ccgctatcag gacatagcgt tggatacccg tgatattgct 1980
gaagagcttg gcggcgaatg ggctgaccgc ttcctcgtgc tttacggtat cgccgctccc 2040
gattcgcagc gcatcgcctt ctatcgcctt cttgacgagt tcttctgaat tgaaaaagga 2100
agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 2160
ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 2220
gtgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 2280
gccccgaaga acgttttcca atgatgagca cttttaaagt tctgctatgt catacactat 2340
tatcccgtat tgacgccggg caagagcaac tcggtcgccg ggcgcggtat tctcagaatg 2400
acttggttga gtactcacca gtcacagaaa agcatcttac ggatggcatg acagtaagag 2460
aattatgcag tgctgccata accatgagtg ataacactgc ggccaactta cttctgacaa 2520
cgatcggagg accgaaggag ctaaccgctt ttttgcacaa catgggggat catgtaactc 2580
gccttgatcg ttgggaaccg gagctgaatg aagccatacc aaacgacgag agtgacacca 2640
cgatgcctgt agcaatgcca acaacgttgc gcaaactatt aactggcgaa ctacttactc 2700
tagcttcccg gcaacaatta atagactgga tggaggcgga taaagttgca ggaccacttc 2760
tgcgctcggc ccttccggct ggctggttta ttgctgataa atctggagcc ggtgagcgtg 2820
ggtctcgcgg tatcattgca gcactggggc cagatggtaa gccctcccgt atcgtagtta 2880
tctacacgac ggggagtcag gcaactatgg atgaacgaaa tagacagatc gctgagatag 2940
gtgcctcact gattaagcat tggtaactgt cagaccaagt ttactcatat atactttaga 3000
ttgatttaaa acttcatttt taatttaaaa ggatctaggt gaagatcctt tttgataatc 3060
tcatgaccaa aatcccttaa cgtgagtttt cgttccactg agcgtcagac cccgtagaaa 3120
agatcaaagg atcttcttga gatccttttt ttctgcgcgt aatctgctgc ttgcaaacaa 3180
aaaaaccacc gctaccagcg gtggtttgtt tgccggatca agagctacca actctttttc 3240
cgaaggtaac tggcttcagc agagcgcaga taccaaatac tgtccttcta gtgtagccgt 3300
agttaggcca ccacttcaag aactctgtag caccgcctac atacctcgct ctgctaatcc 3360
tgttaccagt ggctgctgcc agtggcgata agtcgtgtct taccgggttg gactcaagac 3420
gatagttacc ggataaggcg cagcggtcgg gctgaacggg gggttcgtgc acacagccca 3480
gcttggagcg aacgacctac accgaactga gatacctaca gcgtgagcat tgagaaagcg 3540
ccacgcttcc cgaagggaga aaggcggaca ggtatccggt aagcggcagg gtcggaacag 3600
gagagcgcac gagggagctt ccagggggaa acgcctggta tctttatagt cctgtcgggt 3660
ttcgccacct ctgacttgag cgtcgatttt tgtgatgctc gtcagggggg cggagcctat 3720
ggaaaaacgc cagcaacgcg gcctttttac ggttcctggc cttttgctgg ccttttgctc 3780
acatgttctt tcctgcgtta tcccctgatt ctgtggataa ccgtattacc gcctttgagt 3840
gagctgatac cgctcgccgc agccgaacga ccgagcgcag cgagtcagtg agcgaggaag 3900
cggaag 3906

<210> 18
<211> 310
<212> DNA
<213> Artificial sequence

<220>
<223> pUNI vector multiple cloning site

<400> 18
gagcttagta cgtactatca acaggttgaa ctgctgatca acagatcctc tacgcggccg 60
cggtaccata acttcgtata gcatacatta tacgaagtta tcggaggaat tggctcgagg 120
aattcaccgg tgccgtgtgg gcggatccgg gcccgacgtc aggcctcgat cggagctcgg 180
taagcctatc cctaaccctc tcctcggtct cgattctagc catcatcacc atcaccattg 240
aagctcgcta tcagcctcga ctgtgccttc tagttgccag ccatctgttg tttgcccctc 300
ccccgtgcct 310